A Web Aggregation Approach for 
Distributed Randomized PageRank Algorithms* 

Hideaki Ishii 

Department of Computational Intelligence and Systems Science 
Tokyo Institute of Technology 
4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8502, Japan 
E-mail: ishii@dis.titech.ac.jp 






Roberto Tempo 



CNR-IEIIT, Politecnico di Torino 

Corso Duca degli Abruzzi 24, 10129 Torino, Italy 

E-mail: roberto.tempo@polito.it 


Er-Wei Bai 

Department of Electrical and Computer Engineering, The University of Iowa 

4316 Seamans Center for the Engineering Arts and Sciences 

Iowa City, IA 52242-1527, U.S.A. 

and School of Electronics, Electrical Engineering and Computer Science 

Queen's University, Belfast, BT7 1NN, U.K. 

E-mail: er-wei-bai@uiowa.edu 

March 1, 2013 

Abstract 

The PageRank algorithm employed at Google assigns a measure of importance to each web 
page for rankings in search results. In our recent papers, we have proposed a distributed 
randomized approach for this algorithm, where web pages are treated as agents computing their 
own PageRank by communicating with linked pages. This paper builds upon this approach to 
reduce the computation and communication loads for the algorithms. In particular, we develop 
a method to systematically aggregate the web pages into groups by exploiting the sparsity 
inherent in the web. For each group, an aggregated PageRank value is computed, which can 
then be distributed among the group members. We provide a distributed update scheme for 
the aggregated PageRank along with an analysis on its convergence properties. The method 
is especially motivated by results on singular perturbation techniques for large-scale Markov 
chains and multi-agent consensus. 

1 Introduction 

When using the search engine Google, the rankings in search results take account of various aspects 
of web pages, but it has been acknowledged that the so-called PageRank algorithm provides crucial 
information for this purpose. This algorithm assigns to each web page a measure of its importance 
or popularity based solely on the link structure of the web. In particular, pages possessing more 
links, especially those from important pages, are given higher PageRank values, increasing the 
chance to be placed on the top of search results (see, e.g., [9], [11], [37]). 

One of the main challenges in implementing this algorithm is the size of the web. It is reported 
that the number of web page indices collected at Google is over 10 billion, causing serious issues 
for computation. Numerical methods for PageRank have been a subject of recent research. In 
the adaptive scheme of [33], computational resources are allocated to pages whose convergence to 
the PageRank values is slow. In [38], the problem size is reduced by treating the set of the so-called 
dangling nodes as a single node. The work of [5] employs techniques based on Monte Carlo 
simulation. On the other hand, numerical analysis methods known as asynchronous iterations [6] 
are applied to PageRank algorithms in [20], [36]. In [5T], a randomized algorithm is proposed based on 
stochastic descent methods with an explicit bound on the convergence rate. Moreover, the variations 
in PageRank values when the link structure changes have been studied from the viewpoint of 
fragile/uncertain links in [28] and also from optimization and linear algebra viewpoints in [19], [21] (see footnote 1). 

In our recent paper [30], we focused on this algorithm and developed a distributed randomized 
approach for PageRank computation. From the control theoretic viewpoint, a key observation is 
that the PageRank computation shares several features with the multi-agent consensus problems, 
which have gained much attention in recent years; see, e.g., [HIHI6] and the references therein. 
Thus, we view the web as a network of agents having computation and communication capabilities. 
The idea is to let each web page, or the server that hosts it, compute its own PageRank value 
by communicating with neighboring pages connected by direct links. To realize asynchronous 
communication, it employs the so-called gossip protocol, where the pages randomly determine 
when information should be transmitted. Such a randomization-based method is motivated by the 
recent advances on probabilistic methods in systems and control [45] and has been adopted in the 
literature of multi-agent consensus (e.g., [HdSKnilMllSllIMllllllMlIlT]). In $29\M\, we have also 
considered the effects of communication failures among the pages under this approach. 

This paper aims at generalizing the distributed PageRank algorithms in [30] so that they can 
be used in an environment under limited resources. In particular, we develop efficient algorithms 
by reducing the computation and communication loads. In such an environment, the 
computation of the true PageRank values may be difficult. Consequently, we provide an alternative 
method for finding a good approximation with bounds on the possible errors. 

1 In the area of e-commerce, various methods are known to enhance the chances of specific web pages being placed 
higher in search results, e.g., by including effective keywords in the pages. Such methods are sometimes referred to as 
search engine optimization (e.g., [H]). In these methods, the importance of PageRank is often emphasized through 
adding proper links. 

The proposed approach is based on a novel aggregation method of the original web to reduce 
the size of the problem. The pages are first divided into a number of groups, for example, based 
on the hosts or the domains of the pages. It is known that most of the links in the web are intra-host 
ones [10], [17], and thus the underlying graph has certain sparsity properties. To exploit such 
properties, we further aggregate the graph so that each group either (i) has more internal links 
than those going outside or (ii) consists of just one page. The aggregation procedure is easy to 
implement, employing a simple criterion, and can be applied to graphs with any link structures. 
Then, each group computes only one value in a decentralized manner via an enhanced version of 
the algorithms in [30]. This value represents the total value of the group members. Once this is 
computed, it can be distributed among the group members to determine their individual values. It 
is demonstrated theoretically and also through a numerical example that aggregation can reduce 
the computational cost while maintaining the accuracy and the convergence rate at a level similar 
to the non-aggregated full-order case. 

The aggregation technique is particularly motivated by the singular perturbation analyses for 
large-scale systems in Markov chains [2], [43] (see also [16], [48]) and multi-agent consensus type problems 
[7], [15]. It is important to note that common in these works is the strong assumption on 
interactions among groups, requiring all groups to have only limited ratios of outgoing links towards 
other groups. This would not hold for the web graph where pages with many external links 
may not be common but indeed do exist. In the proposed aggregation method, such pages are 
treated as exceptions and are separated into groups of their own. We will later discuss the relation 
of our approach to those using singular perturbation techniques in more detail. Aggregation for 
PageRank computation has also been explored to obtain acceptable approximation in [37], [39] by 
classical methods in the Markov chain literature. The paper [10] examines aggregation through 
extensive simulation using large data of the web. Recent works on aggregation for Markov chains 
can be found, e.g., in [23]. More generally, in the literature of complex networks, partitioning 
graphs into communities is a topic widely studied under various criteria for detecting communities; 
see, for example, [22] and the references therein. 

From the viewpoint of the distributed randomized approach, there are two new features in the 
current paper. First, the nodes initiate updates in a random manner as in the original algorithm 
of [30]. The difference is that an updating page transmits its value only to the pages to which it has 
outgoing links; this means that the extra data required in the previous algorithms on pages having 
incoming links towards this page becomes unnecessary. Second, each node can further divide its 
linked pages into several groups and communicates with them separately under possibly different 
update probabilities. These features are useful to reduce the amount of the overall communication, 
especially for pages with many links. 

This paper is organized as follows: We first give a brief overview of the PageRank problem in 
Section 2. In Section 3, we introduce the approach for web aggregation and the communication 
protocol among agents and then formulate the problem of computing the PageRank values via web 
aggregation. In Section 4, the main results on the aggregation-based algorithm are presented along 
with an analysis on error bounds. Discussions on the relation to singular perturbation techniques 
are given in Section 5. In Section 6, the distributed randomized approach is developed for the part 
of the proposed algorithm based on the reduced-order recursion. We provide a numerical example 
in Section 7 to illustrate the results. The paper is finally concluded in Section 8. Parts of the 
results in this paper have appeared in preliminary forms in [31], [32]. 

Notation: For vectors and matrices, inequalities are used to denote entry-wise inequalities: For 
$X, Y \in \mathbb{R}^{n \times m}$, $X \leq Y$ implies $x_{ij} \leq y_{ij}$ for $i = 1, \ldots, n$ and $j = 1, \ldots, m$; in particular, we say that 
the matrix $X$ is nonnegative if $X \geq 0$ and positive if $X > 0$. A probability vector is a nonnegative 
vector $v \in \mathbb{R}^n$ such that $\sum_{i=1}^n v_i = 1$. A matrix $X \in \mathbb{R}^{n \times n}$ is said to be (column) stochastic if it 
is nonnegative and each column sum equals 1, i.e., $\sum_{i=1}^n x_{ij} = 1$ for each $j$. Let $\mathbf{1}_n \in \mathbb{R}^n$ be the 
vector whose entries are all 1 as $\mathbf{1}_n := [1 \cdots 1]^T$. Similarly, $S \in \mathbb{R}^{n \times n}$ is the matrix with all entries 
being 1. The spectral radius of the matrix $X \in \mathbb{R}^{n \times n}$ is denoted by $\rho(X)$. 

2 The PageRank problem 

In this section, the PageRank problem is briefly described based on, e.g., [9], [11], [37]. 

Consider the directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ representing a network of $n$ web pages. Here, $\mathcal{V} := 
\{1, 2, \ldots, n\}$ is the set of nodes corresponding to the web page indices while $\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$ is the set 
of edges for the links among the pages. The node $i$ is connected to the node $j$ by an edge, i.e., 
$(i, j) \in \mathcal{E}$, if page $i$ has an outgoing link to page $j$. 

The objective of the PageRank algorithm is to assign some measure of importance to each web 
page. The PageRank value of page $i \in \mathcal{V}$ is given by $x_i^* \in [0, 1]$. The relation $x_i^* > x_j^*$ implies that 
page $i$ has higher rank than page $j$. The pages are ranked according to the rule that a page having 
more links, especially those from important pages, becomes more important. This is done in such 
a way that the value of one page equals the sum of the contributions from all pages that have links 
to it. Let the values be in the vector form as $x^* \in [0, 1]^n$. Then, the PageRank vector $x^*$ is defined 
by 

    $$x^* = A x^*, \quad x^* \in [0, 1]^n, \quad \mathbf{1}_n^T x^* = 1, \tag{1}$$

where the link matrix $A = (a_{ij}) \in \mathbb{R}^{n \times n}$ is given by $a_{ij} = 1/n_j$ if $(j, i) \in \mathcal{E}$ and $0$ otherwise, and 
$n_j$ is the number of outgoing links of page $j$. Hence, the value vector $x^*$ is a nonnegative unit 
eigenvector corresponding to the eigenvalue 1 of $A$. 
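
As a rough illustration of the definition of the link matrix, the following minimal Python sketch builds $A$ from an edge list for a hypothetical three-page web. The function name `link_matrix`, the 0-based page indices, and the example graph are assumptions made only for illustration; the sketch also assumes that every page has at least one outgoing link and that no link is listed twice.

```python
from fractions import Fraction

def link_matrix(n, edges):
    """Return the n x n link matrix A as a list of rows.

    edges is a list of (j, i) pairs meaning that page j links to page i,
    with pages numbered 0..n-1 (the text numbers them 1..n).
    """
    out_degree = [0] * n
    for j, _ in edges:
        out_degree[j] += 1
    A = [[Fraction(0)] * n for _ in range(n)]
    for j, i in edges:
        # a_ij = 1/n_j if page j has a link to page i
        A[i][j] = Fraction(1, out_degree[j])
    return A

# Hypothetical web: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
A = link_matrix(3, edges)
column_sums = [sum(A[i][j] for i in range(3)) for j in range(3)]
```

In this example every column of `A` sums to one, which is the column-stochastic property used in the sequel.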

In general, for this eigenvector to exist and then to be unique, it is sufficient that the web as a 
graph is strongly connected [27] (see footnote 2). However, the web is known not to be strongly connected. Thus, 
the convention is to slightly modify the problem as follows. First, to simplify the discussion, we 
redefine the graph, and thus the matrix $A$, by bringing in artificial links for nodes with no outgoing 
links such as PDF files. This can be done by adding links back to the pages having links to such 
pages. As a result, the link matrix $A$ becomes a stochastic matrix, that is, $\sum_{i=1}^n a_{ij} = 1$ for each $j$. 
This implies that there exists at least one eigenvalue equal to 1. To guarantee the uniqueness of this 
eigenvalue, let $m$ be a parameter such that $m \in (0, 1)$, and let the modified link matrix $M \in \mathbb{R}^{n \times n}$ 
be defined by $M := (1 - m)A + (m/n)S$, where $S \in \mathbb{R}^{n \times n}$ is the matrix whose entries are all 1. 



2 A directed graph is said to be strongly connected if for any two nodes $i, j \in \mathcal{V}$, there exists a sequence of edges 
which connects node $i$ to node $j$. 



Notice that $M$ is a positive stochastic matrix (see footnote 3). By Perron's theorem [27], the eigenvalue 1 is of 
multiplicity 1 and is the unique eigenvalue with maximum magnitude. Further, the corresponding 
eigenvector is positive. Hence, we redefine the value vector $x^*$ by using $M$ as follows. 

Definition 2.1 The PageRank value vector $x^*$ is given by 

    $$x^* = M x^*, \quad x^* \in [0, 1]^n, \quad \mathbf{1}_n^T x^* = 1. \tag{2}$$

Due to the large dimension of the link matrix $M$, the computation of $x^*$ is difficult. The solution 
employed in practice is based on the power method given by the recursion 

    $$x(k+1) = M x(k) = (1 - m) A x(k) + \frac{m}{n} \mathbf{1}_n, \tag{3}$$

where $x(k) \in \mathbb{R}^n$ and the initial vector $x(0) \in \mathbb{R}^n$ is a probability vector. The second equality above 
follows from the fact $S x(k) = \mathbf{1}_n$, $k \in \mathbb{Z}_+$. For implementation, the form on the far right-hand 
side is important, using only the sparse matrix $A$ and not the dense matrix $M$. This method 
asymptotically finds the value vector as shown below [27]. 

Lemma 2.2 In the update scheme (3), for any initial state $x(0)$ that is a probability vector, it 
holds that $x(k) \to x^*$ as $k \to \infty$. 
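
The recursion (3) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `pagerank_power`, the stopping tolerance, the iteration cap, and the small example matrix are hypothetical choices, and a dense list-of-rows matrix is used for brevity where a real implementation would exploit the sparsity of $A$.

```python
def pagerank_power(A, m=0.15, tol=1e-12, max_iter=1000):
    """Iterate x(k+1) = (1 - m) A x(k) + (m/n) 1_n from the uniform vector."""
    n = len(A)
    x = [1.0 / n] * n  # x(0) is a probability vector
    for _ in range(max_iter):
        x_next = [(1 - m) * sum(A[i][j] * x[j] for j in range(n)) + m / n
                  for i in range(n)]
        # stop once successive iterates are close in the 1-norm
        if sum(abs(a - b) for a, b in zip(x_next, x)) < tol:
            return x_next
        x = x_next
    return x

# Hypothetical column-stochastic link matrix for three pages
A = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
x_star = pagerank_power(A)
```

Only the right-hand form of (3) is used, so the dense matrix $M$ is never formed.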

3 Problem formulation 

In this section, we introduce the problem setting for the distributed computation of the aggregated 
PageRank. Following the randomized distributed approach proposed in [30], we view the web 
as a network of agents having computation and communication capabilities. The focus here is 
to extend the distributed algorithm of [30] so that it can be executed with reduced computation 
and communication to determine approximate values of the exact PageRank. In what follows, we 
present the procedure for aggregation of the web and then introduce the communication protocol. 

3.1 Web aggregation 

In the proposed approach, the original web is aggregated by assigning each page into a number 
of groups and then each group computes one value, which is the sum of the values of the group 
members. We aggregate the pages sharing the following three properties: (i) The pages are placed 
under the same host/server so that their values can be computed together. (ii) Each group has 
a sufficiently large number of internal links. More specifically, pages have more links within their 
own groups than those pointing at pages that belong to other groups having multiple members. 
(iii) Group members are expected to take similar values in PageRank; this may be known from 
past computations and/or the link structure. We will show that the process of grouping can be 
done locally at each host. 



3 In the original algorithm in [5], a typical value for m is reported to be m = 0.15, but no specific reason is given 
for this choice. We will use this value throughout this paper. 



We develop a novel aggregation approach by exploiting sparsity properties that the web inher- 
ently has, as stated by (ii) above. The particular approach has a close relation to the singular 
perturbation analysis for large-scale systems with network structures [2], [15], [33]. The method 
there however requires a stronger sparsity notion on the underlying graph, which seems difficult 
to expect in the web. Hence, necessary modifications will be made in the approach. The relation 
among these papers will be discussed in Section 5. 

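First, partition the original web graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ and construct the aggregated graph denoted 
by $\widetilde{\mathcal{G}} = (\widetilde{\mathcal{V}}, \widetilde{\mathcal{E}})$ as follows: 

(i) The node set is given by $\widetilde{\mathcal{V}} := \{1, 2, \ldots, r\}$, and each node $i$ represents a partition set $\mathcal{U}_i$ of 
$\mathcal{V}$, that is, $\bigcup_i \mathcal{U}_i = \mathcal{V}$ and $\mathcal{U}_i \cap \mathcal{U}_j = \emptyset$, $\forall i \neq j$. We call the set $\mathcal{U}_i$ a group of pages. Let $r$ be 
the number of groups, and let $\tilde{n}_i$ be the number of pages in group $\mathcal{U}_i$. Thus, $\sum_{i=1}^r \tilde{n}_i = n$. 

(ii) The edge set $\widetilde{\mathcal{E}} \subset \widetilde{\mathcal{V}} \times \widetilde{\mathcal{V}}$ satisfies that if $(i_1, i_2) \in \mathcal{E}$, then $(h(i_1), h(i_2)) \in \widetilde{\mathcal{E}}$, where $h : \mathcal{V} \to \widetilde{\mathcal{V}}$ 
is the function indicating the group $j$ that the web page $i$ belongs to such that $h(i) = j$, or 
$i \in \mathcal{U}_j$. 

To simplify the notation, without loss of generality, we assume that in the PageRank vector $x^*$, 
the first $\tilde{n}_1$ entries correspond to the pages belonging to group $\mathcal{U}_1$, and the following $\tilde{n}_2$ entries are 
for those in group $\mathcal{U}_2$, and so on. 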

We make the following assumption regarding the grouping. It says that each group should have 
a sufficiently small number of external links compared to internal ones. Recall from (1) that $n_i$ 
denotes the number of outgoing links of page $i$, and let $n_{\mathrm{ext},i}$ be the number of outgoing links from 
page $i$ to groups having more than one page. Following [15], we define the node parameter $\delta_i$ of 
page $i$ by 

    $$\delta_i := \frac{n_{\mathrm{ext},i}}{n_i}, \quad i = 1, \ldots, n. \tag{4}$$

Assumption 3.1 Given the bound $\delta \in (0, 1)$ on node parameters, each group $j$ satisfies one of 
the following conditions: 

(i) For each page $i$ in group $j$, it holds that $\delta_i \leq \delta$. 

(ii) Group $j$ consists of only one page. 

In view of (ii) above, groups with one member are called single groups; denote by $r_1$ the number 
of such groups. These groups represent exceptional pages having high ratios of external links. 

After the groups of pages are determined satisfying the assumptions above, we consider the 
values that represent the groups. For this purpose, in the update scheme, we employ the coordinate 
transformation $\tilde{x}(k) := V x(k)$ via the matrix $V = [V_1^T \; V_2^T]^T \in \mathbb{R}^{n \times n}$ given by 

    $$V_1 := \mathrm{bdiag}\bigl(\mathbf{1}_{\tilde{n}_i}^T\bigr) \in \mathbb{R}^{r \times n}, \qquad
      V_2 := \mathrm{bdiag}\Bigl([I_{\tilde{n}_i - 1} \;\; 0] - \frac{1}{\tilde{n}_i} \mathbf{1}_{\tilde{n}_i - 1} \mathbf{1}_{\tilde{n}_i}^T\Bigr) \in \mathbb{R}^{(n-r) \times n}, \tag{5}$$

where $\mathrm{bdiag}(X_i)$ denotes a block-diagonal matrix whose $i$th diagonal block is $X_i$. It should be noted 
that $V_1$ and $V_2$ are block-diagonal matrices containing $r$ and $r - r_1$ blocks, respectively. They have 
simple structures, depending only on the sizes $\tilde{n}_i$ of the groups. Note that in $V_2$, if the $i$th group 
is a single one (i.e., $\tilde{n}_i = 1$), then the $i$th block has the size $0 \times 1$, meaning that the corresponding 
column is zero. Moreover, $V_1$ and $V_2$ are orthogonal: $V_1 V_2^T = 0$. 
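
To make the structure of the transformation concrete, consider a small illustrative case that is not taken from the analysis itself: $n = 3$ pages in $r = 2$ groups of sizes $\tilde{n}_1 = 2$ and $\tilde{n}_2 = 1$, so that the second group is a single group. Assuming the block of $V_2$ for group $i$ is $[I_{\tilde{n}_i - 1} \;\; 0] - (1/\tilde{n}_i) \mathbf{1}_{\tilde{n}_i - 1} \mathbf{1}_{\tilde{n}_i}^T$, the two matrices become:

```latex
V_1 = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
V_2 = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}
      - \tfrac{1}{2} \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}
    = \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} & 0 \end{bmatrix}, \qquad
V_1 V_2^T = \begin{bmatrix} \tfrac{1}{2} - \tfrac{1}{2} \\ 0 \end{bmatrix} = 0 .
```

The single group contributes a $0 \times 1$ block to $V_2$, which shows up as the zero third column. For a state $x = [x_1 \; x_2 \; x_3]^T$, the first part $V_1 x$ collects the group totals $x_1 + x_2$ and $x_3$, while $V_2 x = (x_1 - x_2)/2$ is the deviation of page 1 from the average of its group.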

The PageRank vector $\tilde{x}^*$ and the state $\tilde{x}(k)$ after the transformation are partitioned as 

    $$\tilde{x}(k) = \begin{bmatrix} \tilde{x}_1(k) \\ \tilde{x}_2(k) \end{bmatrix} = \begin{bmatrix} V_1 \\ V_2 \end{bmatrix} x(k), \tag{6}$$

with $\tilde{x}^* = V x^*$ partitioned in the same way into $\tilde{x}_1^*$ and $\tilde{x}_2^*$. 

In the first part $\tilde{x}_1^*$ of the PageRank vector in the new coordinate, the $i$th entry is the total value 
of the members in group $i$; this vector $\tilde{x}_1^*$ is referred to as the aggregated PageRank. In the second 
part $\tilde{x}_2^*$, each entry represents the difference between a page value and the average value of the 
group members. In the distributed algorithm developed in Section 6, the objective is to compute 
the aggregated PageRank $\tilde{x}_1^*$ in a recursive fashion via information exchange only among groups. 
After this is completed, the second part $\tilde{x}_2^*$ should be obtained. It will be shown that in this stage, 
transmissions among pages in different groups are necessary, but only once during the execution of 
the algorithm. Hence, reduced communication load can be expected in particular when $r$ is small. 

Remark 3.2 For a given bound $\delta$ on the node parameters, a simple grouping procedure for 
Assumption 3.1 to hold can be described as follows. The pages are initially grouped based on their 
hosts, so the computation of the node parameters $\delta_i$ in $\mathcal{G}$ can be done locally. Any page $i$ whose 
$\delta_i$ does not satisfy the condition (i) is taken out from the group; such pages are treated as single 
groups, for which the condition (ii) applies. Other pages still belong to the same group, and thus 
their parameters $\delta_i$ are updated and then checked whether (i) holds for this new group. These steps 
are repeated until all pages under one host satisfy the assumption. It is clear that for any given 
bound $\delta$ on the node parameters, this procedure will terminate. It should be noted that there is a 
tradeoff between the parameter $\delta$ and the number $r$ of groups: Smaller $\delta$ implies larger $r$, and vice 
versa. $\nabla$ 
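
The grouping procedure in the remark above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `group_pages`, the dictionary-based data layout, and the example web are hypothetical. The external link count is interpreted here as the number of links to pages in other groups that currently have more than one member, and, for simplicity, group sizes are read from one global table, whereas the remark describes the computation as being done locally at each host.

```python
def group_pages(out_links, host, delta):
    """Split host-based groups until every multi-page group meets the bound.

    out_links: dict mapping each page to the set of pages it links to.
    host:      dict mapping each page to its host label (the initial group).
    delta:     bound on the node parameters, 0 < delta < 1.
    Returns a dict mapping each page to its final group label.
    """
    group = dict(host)
    changed = True
    while changed:
        changed = False
        size = {}
        for g in group.values():
            size[g] = size.get(g, 0) + 1
        for i, targets in out_links.items():
            if size[group[i]] == 1 or not targets:
                continue  # groups with one page satisfy condition (ii)
            # links to pages in other groups that have several members
            n_ext = sum(1 for j in targets
                        if group[j] != group[i] and size[group[j]] > 1)
            if n_ext / len(targets) > delta:
                group[i] = ("single", i)  # take page i out as a single group
                changed = True
                break  # group sizes changed, so recompute them
    return group

# Hypothetical web with two hosts, A and B
host = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
out_links = {
    "a1": {"a2", "a3"},
    "a2": {"a1", "a3"},
    "a3": {"a1", "b1", "b2"},
    "b1": {"b2"},
    "b2": {"b1", "a1"},
}
groups = group_pages(out_links, host, delta=0.4)
```

With the bound 0.4, pages `a3` and `b2` are separated as single groups. Each pass removes at most one page from a multi-page group, so the loop stops after at most as many passes as there are pages.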

3.2 Communication protocol via random gossipping 

We next discuss the communication among groups in the proposed distributed algorithm. 

For the computation of $\tilde{x}_1(k)$, the groups send their values to linked groups. Here, we employ 
a gossip-type protocol, where the groups decide to communicate with their linked neighbors at 
random times. Such a protocol is based on local information only and does not require a common 
clock. It is thus useful in realizing asynchronous algorithms for a network of agents (e.g., [Hll 1 2 1 . 1X3 ], 

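In the aggregated graph $\widetilde{\mathcal{G}} = (\widetilde{\mathcal{V}}, \widetilde{\mathcal{E}})$, the nodes exchange their values over their outgoing links. 
Denote by $\widetilde{\mathcal{V}}_i$ the set of indices of the groups having links from node $i$ as 

    $$\widetilde{\mathcal{V}}_i := \bigl\{ j \in \widetilde{\mathcal{V}} : (i, j) \in \widetilde{\mathcal{E}}, \; i \neq j \bigr\}.$$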



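Here, we allow node $i$ to communicate with a subset of $\widetilde{\mathcal{V}}_i$ at a time. This helps to reduce the 
instantaneous communication load especially for nodes having many links. For this purpose, we 
partition $\widetilde{\mathcal{V}}_i$ into the sets $\widetilde{\mathcal{V}}_{i,1}, \ldots, \widetilde{\mathcal{V}}_{i,g_i}$, where $g_i$ is the number of partition sets, i.e., it holds that 

    $$\widetilde{\mathcal{V}}_i = \bigcup_{\ell=1}^{g_i} \widetilde{\mathcal{V}}_{i,\ell}, \qquad \widetilde{\mathcal{V}}_{i,\ell} \cap \widetilde{\mathcal{V}}_{i,j} = \emptyset, \; \forall \ell \neq j.$$

For each node $i \in \widetilde{\mathcal{V}}$, let $\eta_i(k) \in \{0, 1, \ldots, g_i\}$ be the i.i.d. random process that specifies the set of 
nodes to which it sends the value $(\tilde{x}_1(k))_i$ at time $k$. That is, 

    $$\eta_i(k) = \begin{cases} \ell & \text{if node } i \text{ sends its value to nodes in } \widetilde{\mathcal{V}}_{i,\ell}, \; \ell = 1, 2, \ldots, g_i, \\ 0 & \text{if node } i \text{ does not communicate} \end{cases} \tag{7}$$

for $k \in \mathbb{Z}_+$. The probability distribution of this process is given as 

    $$\alpha_{i,\ell} = \mathrm{Prob}\{\eta_i(k) = \ell\}, \quad \ell = 0, 1, \ldots, g_i, \; k \in \mathbb{Z}_+. \tag{8}$$

The update probabilities $\alpha_{i,\ell} \in (0, 1)$ are chosen so as to satisfy the condition 

    $$\sum_{\ell=0}^{g_i} \alpha_{i,\ell} = 1, \quad i \in \widetilde{\mathcal{V}}. \tag{9}$$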
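
The random selection in (7)-(9) amounts to drawing, independently at each time step, an index from a finite distribution. A minimal Python sketch of one such draw is given below; the function name `sample_eta`, the example probabilities, and the seeded generator are assumptions made for illustration only.

```python
import random

def sample_eta(alpha, rng):
    """Draw eta in {0, 1, ..., g} with Prob{eta = l} = alpha[l].

    alpha[0] is the probability that the node stays silent; alpha[l] for
    l >= 1 is the probability of sending to the l-th subset of neighbours.
    The entries of alpha are assumed to sum to one, as in condition (9).
    """
    u = rng.random()
    cumulative = 0.0
    for l, a in enumerate(alpha):
        cumulative += a
        if u < cumulative:
            return l
    return len(alpha) - 1  # guard against floating-point round-off

# Hypothetical node with g = 2 neighbour subsets
alpha = [0.5, 0.3, 0.2]
rng = random.Random(0)
samples = [sample_eta(alpha, rng) for _ in range(10000)]
silent_fraction = samples.count(0) / len(samples)
```

Each node can run this draw with its own generator, so no common clock or shared randomness is needed.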

The main problem studied in this paper can be roughly restated as follows: Design a distributed 
randomized algorithm for computing approximated PageRank values such that (i) the groups compute 
$\tilde{x}_1(k)$, the total values of their member pages, following the gossip protocol for communication 
and then (ii) from $\tilde{x}_1(k)$, the PageRank vector $x(k)$ and, in particular, the values for individual 
pages are obtained. 

We characterize the web aggregation approach in Section 4 along with error analyses for the 
aggregated PageRank. Then, in Section 6, the distributed randomized algorithm of reduced order 
for computing the group values $\tilde{x}_1(k)$ is discussed. 

4 Aggregation-based PageRank computation 

In this section, we present the approach for aggregating the web graph and then propose an ap- 
proximated version of the PageRank that can be computed from a lower-order update scheme. 

4.1 Definition of aggregated PageRank 

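We begin by analyzing the centralized update scheme of (3) described in Section 2 when the state 
is transformed as $\tilde{x}(k) = V x(k)$ with the matrix $V$ in (5). Let $\widetilde{A} := V A V^{-1}$ be the link matrix in the new coordinate. 
Partition it in accordance with the dimensions of $\tilde{x}_1(k)$ and $\tilde{x}_2(k)$ as 

    $$\widetilde{A} = \begin{bmatrix} \widetilde{A}_{11} & \widetilde{A}_{12} \\ \widetilde{A}_{21} & \widetilde{A}_{22} \end{bmatrix} \tag{10}$$

with $\widetilde{A}_{11} \in \mathbb{R}^{r \times r}$. The update scheme in (3) can be expressed as 

    $$\tilde{x}_1(k+1) = (1 - m) \widetilde{A}_{11} \tilde{x}_1(k) + (1 - m) \widetilde{A}_{12} \tilde{x}_2(k) + \frac{m}{n} u, \tag{11}$$
    $$\tilde{x}_2(k+1) = (1 - m) \widetilde{A}_{21} \tilde{x}_1(k) + (1 - m) \widetilde{A}_{22} \tilde{x}_2(k), \tag{12}$$

where $u := V_1 \mathbf{1}_n = [\tilde{n}_1 \cdots \tilde{n}_r]^T$; we also used the fact $V_2 \mathbf{1}_n = 0$. The initial states are such that 
$\tilde{x}_1(0) \geq 0$ and $\mathbf{1}_r^T \tilde{x}_1(0) = 1$. The steady state of this scheme is the transformed PageRank vector 
$\tilde{x}^*$ given in (6). 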

Now, to derive an approximated version of the update scheme above, we focus on the 
characteristics of the submatrices $\widetilde{A}_{ij}$. The transformation matrix $V$ in (5) has a simple structure, and the 
advantage is that its inverse can be found in an explicit form, which will be useful in our analysis. 
Denote the inverse by $W := V^{-1}$. It can be partitioned as $W = [W_1 \; W_2]$ where 

    $$W_1 := \mathrm{bdiag}\Bigl(\frac{1}{\tilde{n}_i} \mathbf{1}_{\tilde{n}_i}\Bigr) \in \mathbb{R}^{n \times r}, \qquad
      W_2 := \mathrm{bdiag}\Bigl(\begin{bmatrix} I_{\tilde{n}_i - 1} \\ -\mathbf{1}_{\tilde{n}_i - 1}^T \end{bmatrix}\Bigr) \in \mathbb{R}^{n \times (n-r)}.$$

Again, $W_1$ and $W_2$ are block-diagonal matrices with $r$ and $r - r_1$ blocks, respectively. Moreover, 
the rows in $W_2$ that correspond to single groups are zero. It is obvious that $V_1 W_1 = I$, $V_1 W_2 = 0$, 
$V_2 W_1 = 0$, and $V_2 W_2 = I$. 
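
Because $V$ and its inverse $W$ are block diagonal, the change of coordinates and its inverse can be applied group by group without forming any matrix. The following Python sketch illustrates this under stated assumptions: the function names `to_aggregated` and `from_aggregated` and the example vector are hypothetical, and the block of $V_2$ for a group of size $\tilde{n}_i$ is taken to return the deviations of its first $\tilde{n}_i - 1$ members from the group average.

```python
from fractions import Fraction

def to_aggregated(x, sizes):
    """Apply V: return (x1, x2) = (group totals, deviations from group averages)."""
    x1, x2, start = [], [], 0
    for n_i in sizes:
        block = x[start:start + n_i]
        total = sum(block)
        x1.append(total)
        # one deviation for each of the first n_i - 1 members of the group
        x2.extend(v - total / n_i for v in block[:-1])
        start += n_i
    return x1, x2

def from_aggregated(x1, x2, sizes):
    """Apply W = V^{-1}: rebuild x from the group totals and deviations."""
    x, pos = [], 0
    for total, n_i in zip(x1, sizes):
        dev = x2[pos:pos + n_i - 1]
        pos += n_i - 1
        avg = total / n_i
        x.extend(avg + d for d in dev)
        x.append(avg - sum(dev))  # the last member absorbs the remaining deviation
    return x

# Hypothetical values for four pages in groups of sizes 3 and 1
sizes = [3, 1]
x = [Fraction(1, 10), Fraction(3, 10), Fraction(2, 10), Fraction(4, 10)]
x1, x2 = to_aggregated(x, sizes)
x_back = from_aggregated(x1, x2, sizes)
```

Exact rational arithmetic is used so that the round trip `x_back == x` holds exactly; a single group contributes one entry to `x1` and none to `x2`.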

Based on the approach studied in [33], the key observation in the proposed aggregation is that 
the matrix $A$ can be decomposed into three parts as 

    $$A = I + A_{\mathrm{int}} + A_{\mathrm{ext}}. \tag{13}$$

Here, the internal link matrix $A_{\mathrm{int}}$ is block diagonal; its $i$th block is of the size $\tilde{n}_i \times \tilde{n}_i$, whose 
nondiagonal entries are the same as those of $A$, but its diagonal entries are chosen so that the 
column sums are zero. This implies that $I + A_{\mathrm{int}}$ is a block-diagonal stochastic matrix. Hence, it 
easily follows that 

    $$V_1 A_{\mathrm{int}} = 0. \tag{14}$$

On the other hand, the external link matrix $A_{\mathrm{ext}}$ contains all elements in $A$ which are not in the 
block-diagonal $A_{\mathrm{int}}$ while its diagonal entries are chosen so that each column sum equals zero. Let 
$A_{\mathrm{ext0}}$ be an $n \times n$ matrix whose $j$th column is the same as that of $A_{\mathrm{ext}}$ if page $j$ belongs to a 
non-single group (i.e., with more than one member) and zero otherwise for $j = 1, \ldots, n$. By the 
definition of $W_2$, it is simple to check that 

    $$A_{\mathrm{ext}} W_2 = A_{\mathrm{ext0}} W_2. \tag{15}$$
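
The decomposition into internal and external parts can be checked numerically on a small example. The sketch below is only illustrative: the function name `decompose`, the four-page link matrix, and the grouping are hypothetical, and the link matrix is assumed to be column stochastic with a zero diagonal (no self-links).

```python
from fractions import Fraction

def decompose(A, group):
    """Return (A_int, A_ext) such that A = I + A_int + A_ext.

    A:     column-stochastic link matrix (list of rows) with zero diagonal.
    group: group[i] is the group label of page i.
    """
    n = len(A)
    A_int = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            if i != j and group[i] == group[j]:
                A_int[i][j] = A[i][j]  # internal link, copied from A
        # diagonal entry chosen so that the column sum is zero
        A_int[j][j] = -sum(A_int[i][j] for i in range(n))
    A_ext = [[A[i][j] - (1 if i == j else 0) - A_int[i][j] for j in range(n)]
             for i in range(n)]
    return A_int, A_ext

# Hypothetical web: 0 -> 1, 1 -> 0, 1 -> 2, 2 -> 3, 3 -> 2, 3 -> 0
h = Fraction(1, 2)
A = [[0, h, 0, h],
     [1, 0, 0, 0],
     [0, h, 0, h],
     [0, 0, 1, 0]]
group = [0, 0, 1, 1]
A_int, A_ext = decompose(A, group)
```

In this example the column sums of both parts are zero, and summing the rows of `A_int` within each group gives zero in every column, which is the content of (14).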

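By using the facts $W = V^{-1}$, (13), (14), and (15), the submatrices of $\widetilde{A}$ in (10) can be expressed 
as 

    $$\begin{bmatrix} \widetilde{A}_{11} & \widetilde{A}_{12} \\ \widetilde{A}_{21} & \widetilde{A}_{22} \end{bmatrix}
      = \begin{bmatrix} V_1 A W_1 & V_1 A W_2 \\ V_2 A W_1 & V_2 A W_2 \end{bmatrix}
      = \begin{bmatrix} I + V_1 A_{\mathrm{ext}} W_1 & V_1 A_{\mathrm{ext0}} W_2 \\ V_2 (A_{\mathrm{int}} + A_{\mathrm{ext}}) W_1 & I + V_2 (A_{\mathrm{int}} + A_{\mathrm{ext0}}) W_2 \end{bmatrix}. \tag{16}$$

For later use, from $\widetilde{A}_{22}$, we remove $A_{\mathrm{ext0}}$ to obtain the block-diagonal matrix $\widetilde{A}'_{22}$ given by 

    $$\widetilde{A}'_{22} := I + V_2 A_{\mathrm{int}} W_2. \tag{17}$$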

The following results are helpful to justify the approach, as we shall see later. 

Lemma 4.1 (i) The matrix $\widetilde{A}_{11}$ is stochastic. 
(ii) The matrix $\widetilde{A}'_{22}$ in (17) has spectral radius smaller than or equal to 1. 
(iii) Under Assumption 3.1, it holds that $\|A_{\mathrm{ext0}}\|_1 \leq 2\delta$. 

Proof: (i) It is clear that $\widetilde{A}_{11}$ is nonnegative and satisfies $\mathbf{1}_r^T \widetilde{A}_{11} = \mathbf{1}_r^T V_1 A W_1 = \mathbf{1}_r^T$. 

(ii) The matrix $I + A_{\mathrm{int}}$ is stochastic. In particular, it has $r$ diagonal blocks, which are all 
stochastic. Thus, $I + A_{\mathrm{int}}$ has at least $r$ eigenvalues equal to 1, and the rest have magnitude less 
than or equal to 1. The transformation matrix $V$ is composed of orthogonal rows, and thus $\mathrm{Im}\, V_2^T = 
(\mathrm{Im}\, V_1^T)^{\perp}$. However, $\mathrm{Im}\, V_1^T$ is the left eigenspace of $I + A_{\mathrm{int}}$ corresponding to $r$ of the eigenvalues 1. 
This implies that $\mathrm{Im}\, V_2^T$ spans the eigenspace for the rest of the eigenvalues of $I + A_{\mathrm{int}}$; let $\lambda$ be any 
such eigenvalue. Then, there exists a vector $v \in \mathbb{C}^{n-r}$ such that $v^T V_2 (I + A_{\mathrm{int}}) = \lambda v^T V_2$. Notice 
$V_2 W_2 = I$, and thus $v^T V_2 (I + A_{\mathrm{int}}) W_2 = \lambda v^T$, showing that $\lambda$ is an eigenvalue of $V_2 (I + A_{\mathrm{int}}) W_2$ 
as well. Therefore, we conclude that $\rho(\widetilde{A}'_{22}) = \rho(V_2 (I + A_{\mathrm{int}}) W_2) \leq 1$. 

(iii) For $i \neq j$, the $(i, j)$ entry of $A_{\mathrm{ext0}}$ is nonzero (and equals $a_{ij} = 1/n_j$) if and only if page $j$ 
has a link to page $i$, and moreover pages $i$ and $j$ belong to different groups, each of which has 
multiple group members. By assumption, for each column, the sum of its off-diagonal entries is 
less than or equal to $\delta$, but each column sum equals zero. Hence, the 1-norm of $A_{\mathrm{ext0}}$ is bounded 
by $2\delta$. $\Box$ 

An important implication of (ii) and (iii) of this lemma is that if the node parameter $\delta$ in 
Assumption 3.1 is sufficiently small, then the matrix $(1 - m)\widetilde{A}_{22}$ is stable (see footnote 4); this is because from 
(16), we have $\widetilde{A}_{22} = I + V_2 (A_{\mathrm{int}} + A_{\mathrm{ext0}}) W_2 = \widetilde{A}'_{22} + V_2 A_{\mathrm{ext0}} W_2$, where $A_{\mathrm{ext0}}$ is proportional to $\delta$; 
recall also that $\widetilde{A}'_{22}$ is a block-diagonal matrix, which will become crucial from the computational 
viewpoint. We now come to the idea of how to approximate the scheme (11) and (12). First, 
express (12) for $\tilde{x}_2(k)$ using its steady state (i.e., $\tilde{x}_2(k+1) = \tilde{x}_2(k)$) as 

    $$\tilde{x}_2(k) = (1 - m) \bigl[I - (1 - m) \widetilde{A}_{22}\bigr]^{-1} \widetilde{A}_{21} \tilde{x}_1(k), \tag{18}$$

where the matrix $I - (1 - m)\widetilde{A}_{22}$ is nonsingular. This expression is motivated by the time-scale 
separation in singular perturbation based approaches. Substituting this into the recursion (11) for 
$\tilde{x}_1(k)$ yields 

    $$\tilde{x}_1(k+1) = (1 - m) \Bigl\{ \widetilde{A}_{11} + (1 - m) \widetilde{A}_{12} \bigl[I - (1 - m) \widetilde{A}_{22}\bigr]^{-1} \widetilde{A}_{21} \Bigr\} \tilde{x}_1(k) + \frac{m}{n} u. \tag{19}$$

Note that if this recursion is stable, then the steady states of the scheme above with (18) and (19) 
become the same as those of (11) and (12): they are equal to the transformed PageRank $\tilde{x}^*$ in (6). 



4 A matrix is said to be stable if it is Schur stable, that is, if all of its eigenvalues have magnitude less than 1. 



In this approximate form (18) and (19), the scheme requires the recursive computation of 
only the first state $\tilde{x}_1(k)$, whose dimension equals the number $r$ of groups. It thus appears that 
information should be exchanged only among groups. However, we notice that the term 
$\widetilde{A}_{12} [I - (1 - m)\widetilde{A}_{22}]^{-1} \widetilde{A}_{21} \tilde{x}_1(k)$ involves the product of vectors of dimension $n - r$ and consequently may 
not be suitable for distributed computation. 

To reduce the computation and communication, we further simplify the scheme by relaxing the 
objective to that of computing the approximated version of the state $x(k)$. Specifically, we modify 
the scheme (18) and (19) above under the assumption that $\delta$ is sufficiently small. The scheme 
consists of three steps and is given as follows. 

Algorithm 4.2 

1. Take the initial state $\tilde{x}_1(0) \in \mathbb{R}^r$ as a probability vector. At each time $k$, 
compute the first state $\tilde{x}_1(k) \in \mathbb{R}^r$ via the reduced-order recursion 

    $$\tilde{x}_1(k+1) = (1 - m) \widetilde{A}_{11} \tilde{x}_1(k) + \frac{m}{n} u. \tag{20}$$

2. After the updates for $\tilde{x}_1(k)$ converge, compute the second state $\tilde{x}_2(k) \in \mathbb{R}^{n-r}$ by 

    $$\tilde{x}_2(k) = (1 - m) \bigl[I - (1 - m) \widetilde{A}'_{22}\bigr]^{-1} \widetilde{A}_{21} \tilde{x}_1(k). \tag{21}$$

3. The state is transformed back in the original coordinate by 

    $$x(k) = W \tilde{x}(k) = W_1 \tilde{x}_1(k) + W_2 \tilde{x}_2(k). \tag{22}$$

In summary, we obtained Algorithm 4.2, which is an approximated version of the scheme 
in (11) and (12). The approach in the derivation outlined above is (i) to use the steady state of 
$\tilde{x}_2(k)$, and then (ii) to assume small $\delta$ so that $\widetilde{A}_{12}$ and the entries of $\widetilde{A}_{22}$ outside the diagonal 
blocks become small (due to Lemma 4.1(iii)). In particular, the original scheme is triangularized 
by replacing $\widetilde{A}_{12}$ with zeros; as a result, in the first step of the algorithm, only the $r$-dimensional 
dynamics for the group values remains. Moreover, in the second step, $\widetilde{A}'_{22}$ is block diagonal, so the 
matrix inversion in (21) can be done at each group (while the first step is running). The level of 
approximation is guaranteed via detailed analyses provided in Theorems 4.5 and 4.7 in the next 
subsection. 
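
The first step of the algorithm can be sketched as follows, assuming the aggregated matrix is formed as $V_1 A W_1$, so that its $(i, j)$ entry is the total link weight from the pages of group $j$ into group $i$ divided by the size of group $j$. The function names, the fixed number of iterations, and the example web are hypothetical, and the second and third steps (the group-wise inversion and the transformation back) are omitted from this sketch.

```python
def aggregated_link_matrix(A, groups):
    """Entry (i, j): total link weight from group j into group i, divided by
    the number of pages in group j (this equals V1 A W1 for block-ordered pages)."""
    r = len(groups)
    return [[sum(A[p][q] for p in groups[i] for q in groups[j]) / len(groups[j])
             for j in range(r)] for i in range(r)]

def reduced_recursion(A11, sizes, m=0.15, steps=200):
    """Iterate x1(k+1) = (1 - m) A11 x1(k) + (m/n) u with u = group sizes."""
    r, n = len(A11), sum(sizes)
    x1 = [1.0 / r] * r  # probability vector as the initial state
    for _ in range(steps):
        x1 = [(1 - m) * sum(A11[i][j] * x1[j] for j in range(r))
              + m * sizes[i] / n for i in range(r)]
    return x1

# Hypothetical web of four pages: 0 -> 1, 1 -> 0, 1 -> 2, 2 -> 3, 3 -> 2, 3 -> 0
A = [[0.0, 0.5, 0.0, 0.5],
     [1.0, 0.0, 0.0, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
groups = [[0, 1], [2], [3]]  # one two-page group and two single groups
sizes = [len(g) for g in groups]
A11 = aggregated_link_matrix(A, groups)
x1 = reduced_recursion(A11, sizes)
```

Only `r` values are updated per step, and each group needs the values of the groups that link to it, not those of individual pages.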

The convergence of this scheme is outlined below. Similarly to Definition 2.1 for the original 
PageRank vector $x^*$, let $\tilde{x}'_1 \in \mathbb{R}^r$ be the eigenvector of the matrix $(1 - m)\widetilde{A}_{11} + (m/n) u \mathbf{1}_r^T$ 
corresponding to eigenvalue 1 as 

    $$\tilde{x}'_1 = \Bigl[ (1 - m) \widetilde{A}_{11} + \frac{m}{n} u \mathbf{1}_r^T \Bigr] \tilde{x}'_1, \quad \tilde{x}'_1 \in [0, 1]^r, \quad \mathbf{1}_r^T \tilde{x}'_1 = 1. \tag{23}$$

This eigenvector exists and is unique because $\widetilde{A}_{11}$ is stochastic by Lemma 4.1(i) and moreover, 
$u/n$ is a positive probability vector by definition; hence, this matrix $(1 - m)\widetilde{A}_{11} + (m/n) u \mathbf{1}_r^T$ is 
positive stochastic and Perron's theorem [27] can be applied. Then, let 



    $$\tilde{x}' := \begin{bmatrix} \tilde{x}'_1 \\ \tilde{x}'_2 \end{bmatrix}, \quad \text{where } \tilde{x}'_2 := (1 - m) \bigl[I - (1 - m) \widetilde{A}'_{22}\bigr]^{-1} \widetilde{A}_{21} \tilde{x}'_1. \tag{24}$$

Table 1: Comparison of operation costs with communication among groups 

Algorithm            Equation   Bound on numbers of operations 
Original             (3)        $O((2 f_0(A) + n) k_1)$ 
Aggregation based    (20)       $O((2 f_0(\widetilde{A}_{11}) + r) k_2 + f_0(A_{\mathrm{ext}}) + n + r)$ 
                     (21)       $O(2 f_0(A) + 2n + r)$ 

$f_0(\cdot)$: The number of nonzero entries of a matrix 
$k_1, k_2$: The numbers of steps in the recursions 

The first part $\tilde{x}'_1$ is the approximation of the aggregated PageRank $\tilde{x}^*_1$; with some abuse of terminology, 
it will also be called the aggregated PageRank. Finally, we transform this $\tilde{x}'$ back to the original 
coordinate, and let 

    $$x' := V^{-1} \tilde{x}'. \tag{25}$$

The update scheme in the algorithm is guaranteed to converge to this approximated PageRank 
vector $x'$. We state this fact as a proposition, which follows from Lemma 2.2. 



Proposition 4.3 In the three-step update scheme in (20)-(22), for any initial vector $\tilde{x}_1(0)$ that is
a probability vector, it holds that the state $x(k)$ converges to $x'$ in (25) as $k \to \infty$.

We have a few remarks regarding this algorithm from the viewpoint of distributed computation.
In the first step (20), the $r$-dimensional state $\tilde{x}_1(k)$ represents the group values. This step requires
exchange of states only among groups and not among individual pages. Hence, it is suitable for
distributed computation. Once it reaches the steady state, the other two steps should be carried
out. The second step (21) requires transmission over most links in the web for communicating
the $(n-r)$-dimensional vector $\tilde{A}_{21}\tilde{x}_1(k)$. Nevertheless, the subsequent computation in this step
as well as the third step (22) can be done locally within each group. This is because the matrices
$I - (1-m)\tilde{A}'_{22}$, $W_1$, and $W_2$ are all block diagonal.

Remark 4.4 The computational advantage of the aggregation-based approach can be highlighted
in terms of its operation cost [25, 38]. Table 1 summarizes the numbers of operations for the original
scheme (3) and the proposed scheme (20)-(22) in Algorithm 4.2. Here, $f_0(A)$ denotes the number
of nonzero entries in the link matrix $A$. For a sparse matrix, its product with a vector requires
operations of order $2f_0(A)$. Also, $k_1$ and $k_2$ are the numbers of steps required for the convergence
of the recursions; for termination criteria, see, e.g., [33] for the centralized case and [30] for the
distributed case.

For the proposed scheme, the operations that involve interaction among groups via communication
are shown; other steps can be carried out in a decentralized manner and are of polynomial order in $n_i$ for group
$i$. The first step (20) requires the computation of $\tilde{A}_{11} = I + V_1 A_{\mathrm{ext}} W_1$ and the iteration. For the
second step (21), we counted the multiplication of $\tilde{A}_{21}\tilde{x}_1(k)$. As we discussed earlier, here, the
matrix $\tilde{A}'_{22}$ is block diagonal, whose blocks are of the size $(n_i - 1) \times (n_i - 1)$. The inverse of each
block can be computed by the corresponding group. The same holds for the third step (22), where
the transformation matrices $W_1$ and $W_2$ are also block diagonal.

Later in Section 7, through a numerical example, we demonstrate that even with this reduced
computation cost, the aggregation-based approach exhibits high performance in convergence rate
and accuracy. In particular, two distributed algorithms are compared: One with the original order
$n$ and the other with the reduced order $r < n$. The results show that the errors from the true
PageRank decrease to comparable levels at similar rates. $\nabla$
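
To make the per-step comparison in Table 1 concrete, the following minimal sketch evaluates the dominant per-iteration costs $2f_0(A) + n$ and $2f_0(\tilde{A}_{11}) + r$ with the figures quoted for the numerical example of Section 7 ($n = 200$, $r = 71$, $f_0(A) = 2623$, $f_0(\tilde{A}_{11}) = 780$). The helper name `per_step_cost` is hypothetical, and the one-off terms of the aggregated scheme (which involve $f_0(A_{\mathrm{ext}})$, a quantity not reported in the text) are deliberately left out.

```python
def per_step_cost(nonzeros, dim):
    """Rough operation count of one step x <- (1-m) M x + const for a sparse M.

    A sparse matrix-vector product costs about 2 * nonzeros operations;
    adding the constant vector costs dim more.
    """
    return 2 * nonzeros + dim


# Figures quoted in the numerical example of Section 7.
original = per_step_cost(nonzeros=2623, dim=200)   # full-order recursion (3)
aggregated = per_step_cost(nonzeros=780, dim=71)   # first step (20)

ratio = original / aggregated
print(original, aggregated, round(ratio, 2))
```

Under these assumptions each iteration of the aggregated recursion is roughly three times cheaper; the total saving also depends on the step counts $k_1$ and $k_2$.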

4.2 Aggregated PageRank and its approximation error 

In this subsection, we present two results establishing error bounds for the update scheme in
Algorithm 4.2. The results provide useful guidelines on how the aggregation of the web should be
done.

The first theorem is based on the sparsity property in the graph $\mathcal{G}$, represented by the node
parameter in Assumption 3.1.

Letting $\epsilon \in (0,1)$ be a parameter that determines the desired level of approximation, we consider
the upper bound $\delta$ on node parameters. Aggregate the web so that $\delta$ is sufficiently small that

$$\delta \le \frac{m\epsilon}{4(1-m)(1+\epsilon)}. \tag{26}$$

Theorem 4.5 Under Assumption 3.1 with the parameter $\delta$ satisfying (26), the error between the
steady state $x'$ in (25) of the update scheme in Algorithm 4.2 and the PageRank vector $x^*$ is bounded
as

$$\|x^* - x'\|_1 \le \epsilon. \tag{27}$$



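To prove this theorem, it is useful to consider the scheme given by

$$\begin{bmatrix}\tilde{x}_1(k+1)\\ \tilde{x}_2(k+1)\end{bmatrix} = (1-m)\tilde{A}'\begin{bmatrix}\tilde{x}_1(k)\\ \tilde{x}_2(k)\end{bmatrix} + \frac{m}{n}\begin{bmatrix}u\\ 0\end{bmatrix}, \tag{28}$$

where the matrix $\tilde{A}'$ is defined as

$$\tilde{A}' := \begin{bmatrix}\tilde{A}_{11} & 0\\ \tilde{A}_{21} & \tilde{A}'_{22}\end{bmatrix}. \tag{29}$$

This is a modified version of $\tilde{A}$ obtained by replacing $\tilde{A}_{12}$ and $\tilde{A}_{22}$ with $0$ and $\tilde{A}'_{22}$, respectively. Note that
the matrix $(1-m)\tilde{A}'$ is stable because by Lemma 4.1, $\tilde{A}_{11}$ is stochastic and $(1-m)\tilde{A}'_{22}$ is stable.
It is straightforward to show that in the scheme (28), the state converges to $\tilde{x}'$ in (24). Then, the
vector $x' = V^{-1}\tilde{x}'$ given in (25) must be such that

$$x' = (1-m)A'x' + \frac{m}{n}\mathbf{1}_n, \quad \text{where } A' := V^{-1}\tilde{A}'V. \tag{30}$$

We start with a preliminary result regarding this matrix $A'$.

Lemma 4.6 Under Assumption 3.1, it holds that $\|A - A'\|_1 \le 4\delta$.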



Proof: By the definitions of $A$ and $A'$, we have $A - A' = V^{-1}(\tilde{A} - \tilde{A}')V$. Using the submatrix
expressions of (10) and (29) and also (17), we have

$$A - A' = W_1\tilde{A}_{12}V_2 + W_2(\tilde{A}_{22} - \tilde{A}'_{22})V_2 = (W_1V_1 + W_2V_2)A_{\mathrm{ext}}W_2V_2 = A_{\mathrm{ext0}}W_2V_2,$$

where the last equality holds by (15) and $W_1V_1 + W_2V_2 = I$. The matrix $W_1V_1$ is block diagonal
in the form of $\mathrm{bdiag}\bigl((1/n_i)\mathbf{1}_{n_i}\mathbf{1}_{n_i}^T\bigr)$ and is stochastic. Hence, $\|W_1V_1\|_1 = 1$. Also, $W_2V_2 = I - W_1V_1$,
which implies $\|W_2V_2\|_1 \le 2$. Therefore, by Lemma 4.1 (ii), the 1-norm of $A - A'$ can be bounded
as

$$\|A - A'\|_1 \le \|A_{\mathrm{ext0}}\|_1\,\|W_2V_2\|_1 \le 4\delta. \qquad \Box$$



Proof of Theorem 4.5: From (2) and (30), it follows that

$$x^* - x' = (1-m)(Ax^* - A'x') = (1-m)\bigl[(A - A')x^* + A'(x^* - x')\bigr].$$

Thus, we have $[I - (1-m)A'](x^* - x') = (1-m)(A - A')x^*$. Here note that the matrix $I - (1-m)A'$
is nonsingular because $(1-m)A'$ is stable. Thus, we obtain

$$x^* - x' = (1-m)\bigl[I - (1-m)A'\bigr]^{-1}(A - A')x^*. \tag{31}$$

By the condition (26) on $\delta$, it also holds that

$$(1-m)\|A'\|_1 \le (1-m)\bigl[\|A\|_1 + \|A' - A\|_1\bigr] \le (1-m)(1 + 4\delta) < 1,$$

where the second inequality is due to Lemma 4.6. Hence, from (31), we have

$$\|x^* - x'\|_1 \le (1-m)\Bigl\|\sum_{k=0}^{\infty}(1-m)^k(A')^k(A - A')x^*\Bigr\|_1 \le (1-m)\sum_{k=0}^{\infty}\bigl[(1-m)\|A'\|_1\bigr]^k\,\|A - A'\|_1\,\|x^*\|_1 \le \frac{4\delta(1-m)}{1 - (1-m)(1+4\delta)}.$$

Finally, by the bound (26) on $\delta$, we obtain the inequality in (27). $\Box$
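
The relation between condition (26) and the bound (27) can be checked numerically. The sketch below is only an illustration: the function names `delta_max` and `error_bound` are hypothetical, and the value $m = 0.15$ is an assumption (it is the value commonly used for PageRank, not one stated in this subsection). It evaluates the largest admissible $\delta$ for a target accuracy $\epsilon$ and confirms that the last expression in the proof then equals $\epsilon$.

```python
def delta_max(m, eps):
    """Largest node-parameter bound allowed by condition (26)."""
    return m * eps / (4.0 * (1.0 - m) * (1.0 + eps))


def error_bound(m, delta):
    """Final expression in the proof: 4 delta (1-m) / (1 - (1-m)(1 + 4 delta))."""
    denom = 1.0 - (1.0 - m) * (1.0 + 4.0 * delta)
    if denom <= 0.0:
        raise ValueError("delta too large: (1 - m)(1 + 4 delta) must be below 1")
    return 4.0 * delta * (1.0 - m) / denom


m = 0.15     # assumed value of the parameter m
eps = 0.1    # desired accuracy in the 1-norm
d = delta_max(m, eps)
print(d, error_bound(m, d), error_bound(m, d / 2))
```

With these values the admissible $\delta$ is of the order of $4 \times 10^{-3}$, which indicates how conservative the worst-case bound is compared with the error observed in the example of Section 4.3.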



The theorem shows that aggregation is useful in obtaining a good approximation of the PageRank
based on an update scheme of a lower order. For the approximate calculation to be feasible,
the critical condition is (i) of Assumption 3.1, setting a limit on the ratio of external links for each
group. It is however clear that in the web, many pages have many external links outside their own
domains, which will not satisfy this assumption. Such a page should not be grouped with other
pages, but instead be treated as a group on its own; these pages will then satisfy (ii) of Assumption 3.1.
The proposed aggregation method is closely related to those considered in the context of
singular perturbation analyses. The differences will be discussed in detail in Section 5.

We proceed to the second result, which also shows how the web aggregation should be carried
out from a different perspective. In the proposed scheme, the first step (20) involves only the state
$\tilde{x}_1$, which consists of the total values of each group. Hence, when the coordinate is transformed back
as $x = W\tilde{x} = W_1\tilde{x}_1 + W_2\tilde{x}_2$ in the third step (22), each entry of its contribution $W_1\tilde{x}_1$ represents
the average value of the group to which the corresponding page belongs. This means that if the
grouping is done so that the values of the members in each group are similar, we can expect that
the scheme computes $\tilde{x}_1$ with small error. The following theorem provides a quantitative result on
this intuition.



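Theorem 4.7 If $\|x^* - W_1\tilde{x}^*_1\|_1 \le \kappa$, then it holds that

$$\|x^* - W_1\tilde{x}'_1\|_1 \le \frac{\kappa}{m}. \tag{32}$$

Proof: By the coordinate transformation, we have $x^* = W\tilde{x}^* = W_1\tilde{x}^*_1 + W_2\tilde{x}^*_2$. Thus,

$$x^* - W_1\tilde{x}'_1 = W_1(\tilde{x}^*_1 - \tilde{x}'_1) + W_2\tilde{x}^*_2. \tag{33}$$

This means that we shall focus on $\tilde{x}^*_1 - \tilde{x}'_1$. Observe that by definition, $\tilde{x}^*_1$ is part of the equilibrium
of the recursion in (11) and (12). Also, $\tilde{x}'_1$ is the equilibrium of (20). Thus, it follows that

$$\tilde{x}^*_1 - \tilde{x}'_1 = (1-m)\bigl[\tilde{A}_{11}(\tilde{x}^*_1 - \tilde{x}'_1) + \tilde{A}_{12}\tilde{x}^*_2\bigr].$$

Hence, we have $[I - (1-m)\tilde{A}_{11}](\tilde{x}^*_1 - \tilde{x}'_1) = (1-m)\tilde{A}_{12}\tilde{x}^*_2$. By Lemma 4.1 (i), $\tilde{A}_{11}$ is a stochastic matrix,
and consequently $\rho((1-m)\tilde{A}_{11}) = 1 - m < 1$. This implies that $(1-m)\tilde{A}_{11}$ is a stable matrix, and
hence $I - (1-m)\tilde{A}_{11}$ is nonsingular. As a result, it holds that

$$\tilde{x}^*_1 - \tilde{x}'_1 = (1-m)\bigl[I - (1-m)\tilde{A}_{11}\bigr]^{-1}\tilde{A}_{12}\tilde{x}^*_2 = (1-m)\sum_{k=0}^{\infty}\bigl[(1-m)\tilde{A}_{11}\bigr]^k\tilde{A}_{12}\tilde{x}^*_2 = (1-m)\sum_{k=0}^{\infty}\bigl[(1-m)V_1AW_1\bigr]^kV_1AW_2\tilde{x}^*_2, \tag{34}$$

where in the last equality we used (10). Substitution of (34) into (33) results in

$$W_1(\tilde{x}^*_1 - \tilde{x}'_1) + W_2\tilde{x}^*_2 = \Bigl\{\sum_{k=1}^{\infty}\bigl[(1-m)(W_1V_1A)\bigr]^k + I\Bigr\}W_2\tilde{x}^*_2.$$

Now it follows that

$$\|W_1(\tilde{x}^*_1 - \tilde{x}'_1) + W_2\tilde{x}^*_2\|_1 \le \Bigl\|\sum_{k=1}^{\infty}\bigl[(1-m)(W_1V_1A)\bigr]^k + I\Bigr\|_1\|W_2\tilde{x}^*_2\|_1 \le \Bigl[\sum_{k=1}^{\infty}\bigl\|[(1-m)(W_1V_1A)]^k\bigr\|_1 + 1\Bigr]\|W_2\tilde{x}^*_2\|_1 \le \Bigl[\sum_{k=1}^{\infty}(1-m)^k + 1\Bigr]\kappa = \frac{\kappa}{m},$$

where the last inequality holds since $W_1V_1A$ is a stochastic matrix, so that each of its powers has 1-norm equal to 1, and
by the condition $\|W_2\tilde{x}^*_2\|_1 = \|x^* - W_1\tilde{x}^*_1\|_1 \le \kappa$; the final equality is the geometric series $\sum_{k=0}^{\infty}(1-m)^k = 1/m$. Therefore, we arrive at the bound in (32). $\Box$

Figure 1: The example web, where the dashed lines indicate the grouping for aggregation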

The condition $\|x^* - W_1\tilde{x}^*_1\|_1 \le \kappa$ in the theorem may in general be difficult to check because
it requires global information about PageRank. However, it is possible to convert it to a local
condition. In fact, a sufficient condition is that, for each page $i \in \mathcal{V}$, the relative error between
the value $x^*_i$ and the average $(W_1\tilde{x}^*_1)_i$ of its group satisfies $|x^*_i - (W_1\tilde{x}^*_1)_i| \le \kappa x^*_i$. Obviously, this
relative error is zero for any group with only one member. Thus, for more accurate computations,
we may envision running an algorithm that estimates the local value $|x^*_i - (W_1\tilde{x}_1(k))_i|$ in real time; if
the estimate exceeds a given threshold, then the group should be split into smaller groups, each
having smaller relative errors. On the other hand, the theorem is stated in terms of the
1-norm of the approximation errors. As we see in the proof, for this particular norm, a fairly tight
bound is obtained; the reason is that the induced 1-norm of a column-stochastic matrix is always 1.
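
The quantities appearing in Theorem 4.7 and in the local condition are easy to evaluate once a grouping is fixed. As a minimal sketch, the code below uses the PageRank values of the six-page example of Section 4.3, rounded as in (35), together with the grouping $\{1,2\}$, $\{3\}$, $\{4,5,6\}$; the helper name `group_average_vector` and the value $m = 0.15$ are assumptions made for illustration.

```python
x_star = [0.0614, 0.0857, 0.122, 0.214, 0.214, 0.302]   # rounded values from (35)
groups = [[0, 1], [2], [3, 4, 5]]                        # U1, U2, U3 with 0-based pages
m = 0.15                                                 # assumed parameter value


def group_average_vector(x, groups):
    """Return W_1 V_1 x: every entry is replaced by the average of its group."""
    avg = [0.0] * len(x)
    for members in groups:
        mean = sum(x[i] for i in members) / len(members)
        for i in members:
            avg[i] = mean
    return avg


avg = group_average_vector(x_star, groups)
# kappa = ||x* - W_1 x~*_1||_1, the left-hand side of the condition in Theorem 4.7.
kappa = sum(abs(a - b) for a, b in zip(x_star, avg))
# Local relative errors |x*_i - (W_1 x~*_1)_i| / x*_i, one per page.
rel_err = [abs(a - b) / a for a, b in zip(x_star, avg)]
print(round(kappa, 4), round(kappa / m, 3), [round(e, 3) for e in rel_err])
```

A group whose members show large relative errors is a natural candidate for splitting, in line with the discussion above.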

4.3 Example 

We present a simple example to illustrate the idea of PageRank via web aggregation. 

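Consider the web consisting of six pages shown in Fig. 1. As a graph, this web is strongly
connected. The original link matrix $A$ in (1) is given by

$$A = \begin{bmatrix} 0 & 1/2 & 0 & 0 & 0 & 0\\ 1/2 & 0 & 1/3 & 0 & 0 & 0\\ 0 & 1/2 & 0 & 1/3 & 0 & 0\\ 1/2 & 0 & 1/3 & 0 & 0 & 1/2\\ 0 & 0 & 0 & 1/3 & 0 & 1/2\\ 0 & 0 & 1/3 & 1/3 & 1 & 0 \end{bmatrix}.$$

The PageRank vector $x^*$ in (2) can be found as

$$x^* = [0.0614 \;\; 0.0857 \;\; 0.122 \;\; 0.214 \;\; 0.214 \;\; 0.302]^T. \tag{35}$$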



Pages 4 and 6 have the largest number of incoming links, resulting in large PageRank values. Page 6 
is more advantageous because the pages contributing to its value via links, i.e., pages 3, 4, and 5, 
have larger values than those having links to page 4. In particular, page 1 has the smallest number 
of incoming links and obviously the lowest ranking in this web. 

We now aggregate the web and partition the nodes into three groups (i.e., $r = 3$) as $\mathcal{U}_1 = \{1, 2\}$,
$\mathcal{U}_2 = \{3\}$, and $\mathcal{U}_3 = \{4, 5, 6\}$. These are indicated by the dashed lines in Fig. 1, and the aggregated
graph is illustrated in Fig. 2. The nodes $\mathcal{U}_1$ and $\mathcal{U}_3$ contain self-loops while $\mathcal{U}_2$ does not since it is
a single group. Here, the node parameters are $\delta_1 = \delta_2 = 1/2$, $\delta_3 = 1$, $\delta_4 = 1/3$, and $\delta_5 = \delta_6 = 0$.
Thus, all the pages satisfy Assumption 3.1 by taking $\delta = 0.5$. This grouping is also reasonable from
the viewpoint of Theorem 4.7 because in the true PageRank vector $x^*$, the values of the pages in
the groups $\mathcal{U}_1$ and $\mathcal{U}_3$ are relatively close. In this case, the transformation matrix $V$ in (5) is

$$V = \begin{bmatrix}V_1\\ V_2\end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 1 & 1\\ 1/2 & -1/2 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 2/3 & -1/3 & -1/3\\ 0 & 0 & 0 & -1/3 & 2/3 & -1/3 \end{bmatrix},$$

where the nonzero entries of $V_1$ and $V_2$ form diagonal blocks corresponding to the three groups. In $V_2$, the third column corresponding
to the single group $\mathcal{U}_2$ is zero. Then, the PageRank after the coordinate transformation, $\tilde{x}^* = Vx^*$,
can be found as

$$\tilde{x}^* = \bigl[(\tilde{x}^*_1)^T \;\; (\tilde{x}^*_2)^T\bigr]^T = [\,0.147 \;\; 0.122 \;\; 0.731 \;|\; -0.0121 \;\; -0.0294 \;\; -0.0294\,]^T.$$

Notice that the first part $\tilde{x}^*_1$ is a probability vector.

Figure 2: Aggregated graph with the update probabilities $\alpha_{i,\ell}$ for the links (see also Example 6.6)
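
The node parameters quoted above can be reproduced from the link structure. The following sketch assumes that the node parameter $\delta_i$ is the fraction of page $i$'s outgoing link weight that leaves its own group (the definition itself is given earlier in the paper, outside this section), and that the links of Fig. 1 are 1→{2,4}, 2→{1,3}, 3→{2,4,6}, 4→{3,5,6}, 5→{6}, 6→{4,5}; this link structure is consistent with the PageRank values in (35). The function name `node_parameters` is hypothetical.

```python
from fractions import Fraction as F

# Column-stochastic link matrix: A[p][i] is the weight of the link from
# page i+1 to page p+1 (assumed link structure of the six-page example).
A = [
    [0,       F(1, 2), 0,       0,       0, 0],
    [F(1, 2), 0,       F(1, 3), 0,       0, 0],
    [0,       F(1, 2), 0,       F(1, 3), 0, 0],
    [F(1, 2), 0,       F(1, 3), 0,       0, F(1, 2)],
    [0,       0,       0,       F(1, 3), 0, F(1, 2)],
    [0,       0,       F(1, 3), F(1, 3), 1, 0],
]
groups = [[0, 1], [2], [3, 4, 5]]   # U1, U2, U3 with 0-based page indices


def node_parameters(A, groups):
    """delta_i: total weight of the links of page i that leave its own group."""
    group_of = {i: g for g, members in enumerate(groups) for i in members}
    n = len(A)
    return [sum(A[p][i] for p in range(n) if group_of[p] != group_of[i])
            for i in range(n)]


delta = node_parameters(A, groups)
# The bound delta only concerns pages in groups with more than one member.
bound = max(delta[i] for g in groups if len(g) > 1 for i in g)
print(delta, bound)
```

Page 3 has $\delta_3 = 1$, but since it forms a group by itself it does not enter the bound, which is why $\delta = 0.5$ suffices here.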

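For this grouping, by (13), the matrix $A$ can be decomposed as $A = I + A_{\mathrm{int}} + A_{\mathrm{ext}}$, where the
internal matrix $A_{\mathrm{int}}$ and the external matrix $A_{\mathrm{ext}}$ are respectively

$$A_{\mathrm{int}} = \begin{bmatrix} -1/2 & 1/2 & 0 & 0 & 0 & 0\\ 1/2 & -1/2 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & -2/3 & 0 & 1/2\\ 0 & 0 & 0 & 1/3 & -1 & 1/2\\ 0 & 0 & 0 & 1/3 & 1 & -1 \end{bmatrix}, \quad A_{\mathrm{ext}} = \begin{bmatrix} -1/2 & 0 & 0 & 0 & 0 & 0\\ 0 & -1/2 & 1/3 & 0 & 0 & 0\\ 0 & 1/2 & -1 & 1/3 & 0 & 0\\ 1/2 & 0 & 1/3 & -1/3 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1/3 & 0 & 0 & 0 \end{bmatrix}.$$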



The internal matrix $A_{\mathrm{int}}$ consists of the block-diagonal elements of $A$ while the external matrix
$A_{\mathrm{ext}}$ contains the rest.

The link matrix $\tilde{A}$ in (10) in the corresponding coordinate becomes

$$\tilde{A} = \begin{bmatrix}\tilde{A}_{11} & \tilde{A}_{12}\\ \tilde{A}_{21} & \tilde{A}_{22}\end{bmatrix} = \left[\begin{array}{ccc|ccc} 0.5 & 0.333 & 0 & 0 & 0 & 0\\ 0.25 & 0 & 0.111 & -0.5 & 0.333 & 0\\ 0.25 & 0.667 & 0.889 & 0.5 & -0.333 & 0\\ \hline 0 & -0.167 & 0 & -0.5 & 0 & 0\\ 0.167 & 0.111 & -0.130 & 0.333 & -0.389 & -0.5\\ -0.0833 & -0.222 & -0.0185 & -0.167 & -0.0556 & -0.5 \end{array}\right].$$

The proposed update scheme in Algorithm 4.2 employs the matrices $\tilde{A}_{11}$ above and

$$\bigl[I - (1-m)\tilde{A}'_{22}\bigr]^{-1}\tilde{A}_{21} = \begin{bmatrix} 0 & -0.167 & 0\\ 0.174 & 0.161 & -0.113\\ -0.0758 & -0.172 & -0.00177 \end{bmatrix},$$

where $\tilde{A}'_{22}$ is obtained through (17) using $A_{\mathrm{int}}$:

$$\tilde{A}'_{22} = I + V_2A_{\mathrm{int}}W_2 = \begin{bmatrix} 0 & 0 & 0\\ 0 & -0.167 & -0.5\\ 0 & -0.167 & -0.5 \end{bmatrix}.$$

Notice that the matrix $\tilde{A}_{11}$ is stochastic. Also, $\tilde{A}'_{22}$ has a block-diagonal structure (different from
$\tilde{A}_{22}$) and is a stable matrix. For this scheme, the steady state in the original coordinate is

$$x' = W\tilde{x}' = [0.0566 \;\; 0.0920 \;\; 0.125 \;\; 0.212 \;\; 0.213 \;\; 0.302]^T.$$

Comparing this with the true value $x^*$ in (35), the error is indeed small as $\|x' - x^*\|_1 = 0.0188$.
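
The numbers of this example can be reproduced with a few lines of plain Python. The sketch below is an illustration under stated assumptions rather than the implementation used for the paper: it assumes $m = 0.15$ and the link structure 1→{2,4}, 2→{1,3}, 3→{2,4,6}, 4→{3,5,6}, 5→{6}, 6→{4,5}, and instead of forming the inverse in (21) explicitly it iterates the block-triangular recursion (28), whose limit is $\tilde{x}'$ in (24). All function and variable names are hypothetical.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def matvec(X, v):
    return [sum(a * b for a, b in zip(row, v)) for row in X]


def fixed_point(M, c, x, steps=400):
    """Iterate x <- M x + c; M is stable in both uses below, so this converges."""
    for _ in range(steps):
        x = [a + b for a, b in zip(matvec(M, x), c)]
    return x


m, n, r = 0.15, 6, 3                     # m = 0.15 is an assumed parameter value
h, t = 1 / 2, 1 / 3
A = [[0, h, 0, 0, 0, 0], [h, 0, t, 0, 0, 0], [0, h, 0, t, 0, 0],
     [h, 0, t, 0, 0, h], [0, 0, 0, t, 0, h], [0, 0, t, t, 1, 0]]
groups = [[0, 1], [2], [3, 4, 5]]
V = [[1, 1, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1],
     [h, -h, 0, 0, 0, 0], [0, 0, 0, 2 * t, -t, -t], [0, 0, 0, -t, 2 * t, -t]]
W = [[h, 0, 0, 1, 0, 0], [h, 0, 0, -1, 0, 0], [0, 1, 0, 0, 0, 0],   # W = V^{-1}
     [0, 0, t, 0, 1, 0], [0, 0, t, 0, 0, 1], [0, 0, t, 0, -1, -1]]

# Internal part of A = I + A_int + A_ext: in-group links, columns summing to zero.
group_of = {i: g for g, members in enumerate(groups) for i in members}
A_int = [[A[p][i] if p != i and group_of[p] == group_of[i] else 0.0
          for i in range(n)] for p in range(n)]
for i in range(n):
    A_int[i][i] = -sum(A_int[p][i] for p in range(n))

A_til = matmul(matmul(V, A), W)                        # transformed link matrix (10)
A22p = matmul(matmul(V[r:], A_int), [row[r:] for row in W])
for i in range(n - r):
    A22p[i][i] += 1.0                                  # I + V_2 A_int W_2, as in (17)

# Matrix of recursion (28): A~_12 is replaced by 0 and A~_22 by A~'_22.
M = [[(1 - m) * (A_til[p][i] if i < r else (0.0 if p < r else A22p[p - r][i - r]))
      for i in range(n)] for p in range(n)]
u = [len(g) for g in groups]
c = [m / n * u[p] if p < r else 0.0 for p in range(n)]
x_til = fixed_point(M, c, [1 / r] * r + [0.0] * (n - r))

x_prime = matvec(W, x_til)                             # back to page coordinates (25)
x_star = fixed_point([[(1 - m) * a for a in row] for row in A], [m / n] * n, [1 / n] * n)
err = sum(abs(a - b) for a, b in zip(x_prime, x_star))
print([round(v, 4) for v in x_prime], round(err, 4))
```

Since $(1-m)\tilde{A}'$ is stable, a fixed number of plain iterations suffices here; for a large web the second and third steps would instead be carried out group by group, as described in Section 4.1.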

5 Discussion on aggregation-based methods 

In this section, we provide some discussion on our results from the viewpoint of aggregation. As 
mentioned in the Introduction, the approach of this paper has been motivated by the singular 
perturbation results of 0[15] for Markov chains and [7J03] for consensus-like problems with sparse 
network structures. We have several remarks in relation to these works. 

In the Markov chain literature (e.g., [40]), the problem of finding the stationary probability
distribution based on aggregation has long been studied; see, for example, [18]. The paper [39]
formalizes a general method for finding the exact distribution, and its application to the PageRank
computation is discussed in [37]. The approach of [43] can be seen as an interpretation of this
method from the viewpoint of singular perturbation.

The papers [2, 43] consider the special case when the chain has the so-called nearly completely
decomposable structure. In the context of our paper, this means that the external link matrix $A_{\mathrm{ext}}$
in (13) can be bounded as $\|A_{\mathrm{ext}}\|_1 \le \epsilon'$ with a small $\epsilon' > 0$ so that the interaction among different
groups is weak. If $\epsilon'$ is sufficiently small, the recursion (11) and (12) can be transformed to the
singular perturbation form, to which standard results (e.g., [35]) can be applied. Note that in
Lemma 4.1 (iii), the bound on the external link matrix is for $A_{\mathrm{ext0}}$ and not $A_{\mathrm{ext}}$.

Similarly, the works of 0[T5] deal with problems for multi-agent systems on consensus. The
specific setup involves undirected graphs and hence a link matrix which is symmetric and stochastic
(i.e., not only column stochastic as in the PageRank problem). In these papers, simple transformation
matrices similar to $V$ in (5) have been used. In their results, assumptions are made on
the node parameters for all pages and also on the average value of the node parameters for each
group; these can be (roughly) stated as $\delta_i \le \delta$ for all $i$ and $\sum_{i \in \mathcal{U}_j}\delta_i \le \epsilon'$ for all $j$ with $\delta, \epsilon' > 0$.
The consequence is that $\tilde{A}_{12}$ and $\tilde{A}_{21}$ can be bounded by constant multiples of $\epsilon'$ and hence the
problem becomes similar to the Markov chain case mentioned above.

It is however important to note that the web may not have such strong sparsity properties as
those assumed in the abovementioned works. For instance, for a page belonging to a small group,
one external link can result in a large node parameter. By contrast, the assumption imposed in our
approach is the condition $\delta_i \le \delta$ only in the case page $i$ belongs to a group consisting of multiple
members; this condition can be checked easily in the grouping procedure outlined in Remark 3.2.
Thus, the results are applicable to a graph with any structure after appropriately grouping the
pages. One feature here is the tradeoff between the number $r$ of groups and the node parameter $\delta$
as discussed in Remark 3.2: from Theorem 4.5, we observe that more accurate computation requires
a larger number of groups, and thus a smaller $\delta$.

Furthermore, it is emphasized that in the aggregated recursion (20), the link matrix $\tilde{A}_{11}$ is a
stochastic matrix. This is critical in the distributed algorithm in the next section. In contrast, in
the singular perturbation form, the corresponding matrix is not necessarily stochastic [2].

6 Distributed randomized algorithm for aggregated PageRank 

In this section, we construct a distributed randomized scheme for finding the aggregated PageRank
in the first step of Algorithm 4.2.

To simplify the notation, we rewrite the aggregated PageRank in (24) as $\xi' := \tilde{x}'_1$ and moreover
the recursion in the first step (20) as

$$\xi(k+1) = (1-m)\Phi\xi(k) + \frac{m}{n}u, \tag{36}$$

where the link matrix is denoted by $\Phi = (\phi_{ij}) := \tilde{A}_{11}$ and the state by $\xi(k) := \tilde{x}_1(k)$.

The objective is to compute the aggregated PageRank $\xi'$ via the distributed update scheme of
(36) in the form given by

$$\xi(k+1) = (1-\hat{m})\Phi_{\eta(k)}\xi(k) + \frac{\hat{m}}{n}u, \tag{37}$$

where $\xi(k) \in \mathbb{R}^r$ is the state whose initial condition $\xi(0)$ is a probability vector, and $\hat{m} \in (0,1)$; the
process $\eta(k) := [\eta_1(k) \cdots \eta_r(k)]$ defined in (7) determines the communication pattern at time $k$.
In this scheme, each group $i$ also computes the time average of its own state $\xi_i$. Let $\psi(k)$ be the
average of $\xi(0), \ldots, \xi(k)$ as

$$\psi(k) := \frac{1}{k+1}\sum_{\ell=0}^{k}\xi(\ell) = \frac{1}{k+1}\bigl(k\,\psi(k-1) + \xi(k)\bigr). \tag{38}$$

Let $\alpha \in (0,1]$, which is called the base probability. Recall that the update probability $\alpha_{i,\ell}$ in
(8) determines the probability that group $i$ transmits to its neighbors belonging to $\mathcal{V}_{i,\ell}$ for
$\ell \neq 0$. Assume that they are chosen so that

$$\alpha_{i,\ell} \ge \alpha\sum_{j \in \mathcal{V}_{i,\ell}}\phi_{ji}, \quad \ell = 1, 2, \ldots, g_i, \qquad \sum_{\ell=0}^{g_i}\alpha_{i,\ell} = 1. \tag{39}$$

It is noted that by Lemma 4.1 (i), the link matrix $\Phi = \tilde{A}_{11}$ is stochastic, and thus $\sum_{j=1}^{r}\phi_{ji} = 1$.
Hence, in this condition, the lower bound $\alpha\sum_{j \in \mathcal{V}_{i,\ell}}\phi_{ji}$ on $\alpha_{i,\ell}$ is at most $\alpha$.

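In (37), the distributed link matrices $\Phi_{q_1,\ldots,q_r}$ for $q_i \in \{0, 1, \ldots, g_i\}$, $i \in \{1, \ldots, r\}$, are given by

$$(\Phi_{q_1,\ldots,q_r})_{pi} := \begin{cases} \dfrac{\alpha}{\alpha_{i,q_i}}\phi_{pi} & \text{if } q_i \neq 0,\ p \in \mathcal{V}_{i,q_i},\\[2mm] 1 - \dfrac{\alpha}{\alpha_{i,q_i}}\displaystyle\sum_{j \in \mathcal{V}_{i,q_i}}\phi_{ji} & \text{if } q_i \neq 0,\ p = i,\\[2mm] 1 & \text{if } q_i = 0,\ p = i,\\ 0 & \text{otherwise} \end{cases} \tag{40}$$

for $p, i \in \{1, \ldots, r\}$. Notice that these link matrices are in accordance with the communication pattern
specified by $\eta(k)$, i.e., $(\Phi_{\eta(k)})_{pi} > 0$ if and only if group $i$ sends its value to group $p$ at time $k$.

Then, we can establish some desired properties of the link matrices for the update scheme (37)
to converge. These facts are stated in the proposition below.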

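Proposition 6.1 For the distributed link matrices $\Phi_q$ in (40), the following two properties are
satisfied:

(i) For each $q$, the matrix $\Phi_q$ is stochastic.

(ii) The average matrix $\bar{\Phi} := E[\Phi_{\eta(k)}]$ can be written as $\bar{\Phi} = \alpha\Phi + (1-\alpha)I$.

Proof: (i) Let $\phi_i \in \mathbb{R}^r$ be the $i$th column of $\Phi$ and further let $\phi_i^{(\ell)} \in \mathbb{R}^r$ be a vector containing
only those elements corresponding to the nodes in $\mathcal{V}_{i,\ell}$, i.e., $(\phi_i^{(\ell)})_p := \phi_{pi}$ if $p \in \mathcal{V}_{i,\ell}$ and $0$ otherwise
for $\ell = 1, 2, \ldots, g_i$ and $p, i \in \{1, \ldots, r\}$. Notice that

$$\phi_i = \sum_{\ell=1}^{g_i}\phi_i^{(\ell)}. \tag{41}$$

Now, the $i$th column of $\Phi_q$ depends only on $q_i$, so denoting this column by $\tilde{\phi}_i^{(\ell)}$ with $\ell = q_i$, we
have by (40)

$$\tilde{\phi}_i^{(\ell)} = \begin{cases} \dfrac{\alpha}{\alpha_{i,\ell}}\phi_i^{(\ell)} + \Bigl(1 - \dfrac{\alpha}{\alpha_{i,\ell}}\displaystyle\sum_{j \in \mathcal{V}_{i,\ell}}\phi_{ji}\Bigr)e_i & \text{if } \ell \neq 0,\\ e_i & \text{if } \ell = 0, \end{cases}$$

where $e_j \in \mathbb{R}^r$ is the unit vector whose $j$th element is 1 and the rest are 0. It is now clear that
$\tilde{\phi}_i^{(\ell)} \ge 0$ because by definition $\phi_i^{(\ell)} \ge 0$ and $1 - (\alpha/\alpha_{i,\ell})\sum_{j \in \mathcal{V}_{i,\ell}}\phi_{ji} \ge 0$ by the choice of $\alpha_{i,\ell}$ in (39).
Moreover, if $\ell \neq 0$, it follows that

$$\|\tilde{\phi}_i^{(\ell)}\|_1 = \frac{\alpha}{\alpha_{i,\ell}}\|\phi_i^{(\ell)}\|_1 + \Bigl(1 - \frac{\alpha}{\alpha_{i,\ell}}\sum_{j \in \mathcal{V}_{i,\ell}}\phi_{ji}\Bigr) = 1,$$

where we used the fact $\|\phi_i^{(\ell)}\|_1 = \sum_{p \in \mathcal{V}_{i,\ell}}\phi_{pi}$. It thus follows that each column of $\Phi_q$ is nonnegative
and the sum of the elements equals one; this implies that this matrix $\Phi_q$ is stochastic.

(ii) Let $\bar{\phi}_i$ be the $i$th column of the average matrix $\bar{\Phi}$. By the distribution of $\eta(k)$, it holds that

$$\bar{\phi}_i = \sum_{\ell=0}^{g_i}\alpha_{i,\ell}\tilde{\phi}_i^{(\ell)} = \sum_{\ell=1}^{g_i}\Bigl(\alpha\phi_i^{(\ell)} - \alpha\sum_{j \in \mathcal{V}_{i,\ell}}\phi_{ji}\,e_i\Bigr) + e_i.$$

By (41) and stochasticity of $\Phi$, we have $\bar{\phi}_i = \alpha\phi_i + (1-\alpha)e_i$. This holds for $i = 1, \ldots, r$, and
consequently we obtain $\bar{\Phi} = \alpha\Phi + (1-\alpha)I$. $\Box$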

We next show that the aggregated PageRank vector $\xi'$ can be expressed in terms of the distributed
link matrices. Recall that by definition, $\xi' = \tilde{x}'_1$. From (23), we have that $\xi'$ is the unique
eigenvector of the link matrix given by $\Gamma := (1-m)\Phi + (m/n)u\mathbf{1}_r^T$ with the property $\mathbf{1}_r^T\xi' = 1$.
Due to the proposition above, this characterization can be extended using the distributed link
matrices $\Phi_{\eta(k)}$. Define the modified link matrices by $\Gamma_{\eta(k)} := (1-\hat{m})\Phi_{\eta(k)} + (\hat{m}/n)u\mathbf{1}_r^T$, and their
average by

$$\bar{\Gamma} := E[\Gamma_{\eta(k)}] = (1-\hat{m})\bar{\Phi} + \frac{\hat{m}}{n}u\mathbf{1}_r^T.$$

Take the parameter $\hat{m}$ as

$$\hat{m} := \frac{m\alpha}{1 - (1-\alpha)m}. \tag{42}$$

The following lemma is the aggregated version of Lemma 3.3 in [30].

Lemma 6.2 For the parameter $\hat{m}$ given in (42), we have the following:
(i) $\hat{m} \in (0,1)$ and $\hat{m} \le m$.
(ii) $\bar{\Gamma} = \dfrac{\hat{m}}{m}\Gamma + \Bigl(1 - \dfrac{\hat{m}}{m}\Bigr)I$.
(iii) The aggregated PageRank vector $\xi' = \tilde{x}'_1$ in (23) is the unique eigenvector of the average
matrix $\bar{\Gamma}$ corresponding to the eigenvalue 1.
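
Properties (i) and (ii) of the lemma can be verified numerically for a small instance. The sketch below uses the $3 \times 3$ matrix $\Phi = \tilde{A}_{11}$ of the example in Section 4.3 and the relation $\bar{\Phi} = \alpha\Phi + (1-\alpha)I$ from Proposition 6.1; the values $m = 0.15$ and $\alpha = 0.5$ as well as the helper names `m_hat` and `gamma` are assumptions chosen for illustration.

```python
def m_hat(m, alpha):
    """Modified parameter as in (42)."""
    return m * alpha / (1.0 - (1.0 - alpha) * m)


def gamma(link, u, mm, n):
    """Return (1 - mm) * link + (mm / n) * u 1^T for an r x r link matrix."""
    r = len(link)
    return [[(1.0 - mm) * link[p][i] + mm / n * u[p] for i in range(r)]
            for p in range(r)]


m, alpha, n = 0.15, 0.5, 6                      # assumed parameter values
u = [2, 1, 3]                                   # group sizes of the example
Phi = [[1 / 2, 1 / 3, 0.0], [1 / 4, 0.0, 1 / 9], [1 / 4, 2 / 3, 8 / 9]]
I3 = [[1.0 if p == i else 0.0 for i in range(3)] for p in range(3)]
Phi_bar = [[alpha * Phi[p][i] + (1 - alpha) * I3[p][i] for i in range(3)]
           for p in range(3)]

mh = m_hat(m, alpha)
lhs = gamma(Phi_bar, u, mh, n)                  # average matrix Gamma-bar
G = gamma(Phi, u, m, n)                         # matrix Gamma
rhs = [[mh / m * G[p][i] + (1 - mh / m) * I3[p][i] for i in range(3)]
       for p in range(3)]
gap = max(abs(lhs[p][i] - rhs[p][i]) for p in range(3) for i in range(3))
print(mh, gap)
```

The gap is at the level of rounding error, which illustrates why $\bar{\Gamma}$ and $\Gamma$ share the eigenvector $\xi'$ for the eigenvalue 1.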

Remark 6.3 It is interesting that the choice of $\hat{m}$ in (42) is different from $m$, but is critical in
establishing the lemma. In particular, noting that the average matrix $\bar{\Gamma}$ is stochastic, we can
guarantee that the average state $\bar{\xi}(k) := E[\xi(k)]$ of the update scheme in (37) converges to the
desired vector $\xi'$. This is because it follows the recursion

$$\bar{\xi}(k+1) = \bar{\Gamma}\bar{\xi}(k), \tag{43}$$

where $\bar{\xi}(0)$ is a probability vector. We however emphasize that the state $\xi(k)$ itself does not converge
to the aggregated PageRank $\xi'$. To resolve this issue, it turns out to be essential to introduce the
time average $\psi(k)$ as we see next. $\nabla$

We are in the position to derive a convergence result for the distributed scheme (37) under the
probability allocation in (8) for the linked nodes.

Theorem 6.4 Consider the distributed update scheme in (37) and (38). For any update probabilities
$\alpha_{i,\ell} \in (0,1]$, $i \in \{1, \ldots, r\}$, $\ell \in \{0, 1, \ldots, g_i\}$, satisfying the conditions in (39), the aggregated
PageRank $\xi'$ can be obtained from the time average $\psi(k)$ of the states $\xi(k)$ in the mean-square
sense as $E\bigl[\|\psi(k) - \xi'\|^2\bigr] \to 0$, $k \to \infty$.

The proof of this theorem follows from noticing that, in Theorem 3.4 of [30], to establish
the mean-square convergence, the properties in Proposition 6.1 are sufficient. In other words,
for convergence, only stochasticity of the link matrices and the average behavior of the update
scheme are relevant. The type of convergence guaranteed by the theorem is known as ergodicity
for stochastic processes [42]. While the general results of [17] can be applied for the proof, we
have developed in [30] a more specific one, which has been useful in extending the algorithm to
incorporate a stopping criterion there. It was also shown there that the convergence rate is of order
$1/k$ due to the time averaging.
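
The behavior described by Theorem 6.4 can be observed in a small simulation. The code below is a hedged sketch, not the authors' implementation: it applies the randomized recursion (37) column by column to the three-group example of Section 4.3, draws the index $q_i$ independently for each group with probabilities proportional to the link weights (one admissible choice under (39)), and maintains the running average (38). The parameter values $m = 0.15$, $\alpha = 0.5$, the seed, the number of steps, and all function names are assumptions.

```python
import random


def stationary(Phi, u, n, m, steps=500):
    """Fixed point of the deterministic recursion (36)."""
    r = len(Phi)
    xi = [1.0 / r] * r
    for _ in range(steps):
        xi = [(1 - m) * sum(Phi[p][i] * xi[i] for i in range(r)) + m / n * u[p]
              for p in range(r)]
    return xi


def simulate(Phi, neighbor_sets, u, n, m, alpha, steps, seed=0):
    """Randomized recursion (37) together with the time average (38)."""
    rng = random.Random(seed)
    r = len(Phi)
    m_hat = m * alpha / (1.0 - (1.0 - alpha) * m)                 # as in (42)
    # Probabilities alpha_{i,l}, l >= 1, proportional to the link weights.
    probs = []
    for i in range(r):
        w = [sum(Phi[j][i] for j in S) for S in neighbor_sets[i]]
        probs.append([alpha * wl / sum(w) for wl in w])
    xi = [1.0 / r] * r
    psi = xi[:]
    for k in range(1, steps + 1):
        new = [m_hat / n * u[p] for p in range(r)]
        for i in range(r):
            draw, acc, chosen = rng.random(), 0.0, None
            for l, p_l in enumerate(probs[i]):
                acc += p_l
                if draw < acc:
                    chosen = l
                    break
            kept = 1.0                       # weight that stays at group i
            if chosen is not None:           # group i transmits to one neighbor set
                scale = alpha / probs[i][chosen]
                for j in neighbor_sets[i][chosen]:
                    share = scale * Phi[j][i]
                    new[j] += (1 - m_hat) * share * xi[i]
                    kept -= share
            new[i] += (1 - m_hat) * kept * xi[i]
        xi = new
        psi = [(k * a + b) / (k + 1) for a, b in zip(psi, xi)]    # update (38)
    return psi


Phi = [[1 / 2, 1 / 3, 0.0], [1 / 4, 0.0, 1 / 9], [1 / 4, 2 / 3, 8 / 9]]
neighbor_sets = [[[1], [2]], [[0], [2]], [[1]]]     # 0-based group indices
u, n, m, alpha = [2, 1, 3], 6, 0.15, 0.5            # m and alpha are assumed values
target = stationary(Phi, u, n, m)
psi = simulate(Phi, neighbor_sets, u, n, m, alpha, steps=20000)
error = sum(abs(a - b) for a, b in zip(psi, target))
print([round(v, 3) for v in target], [round(v, 3) for v in psi], round(error, 4))
```

Because every distributed link matrix is stochastic, the state and its time average remain probability vectors throughout; the individual state keeps fluctuating, while the time average settles near $\xi'$.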

The distributed update scheme presented above has the following features: (i) The computation
performed at each group $i$ includes the updates in the state $\xi_i$ in (37) and the time average $\psi_i$ in (38).
(ii) The communication among the groups is local in the sense that each group communicates only
over direct outgoing links, as seen from the link matrices in (40). (iii) The amount of communication
is determined by the process $\eta$, which specifies the pattern in the interaction between the pages.
(iv) At any group, the update probabilities $\alpha_{i,\ell}$ can be allocated to linked groups locally without
information exchange among groups; one global parameter is $\alpha$, which is critical for the convergence
of the proposed algorithm.$^5$



Remark 6.5 Compared to the original scheme in [30], a significant advantage of the one above is
that the nodes need to communicate with neighbors only over outgoing links. The identity of such
links is necessarily contained in their local data. This is observed in the link matrices $\Phi_q$ in (40)
where only the columns of $\Phi$ (and not the rows) corresponding to the indices $q_i$ taking values 1
are used. In contrast, in [30], the protocol is that the nodes send data over incoming links as well.
Consequently, we also stress that the implementation of this algorithm is simple. A closer look at
the matrices $\Phi_q$ suggests that group $j$ should just send its current value with some weight to the
linked groups, where their values are updated by simply adding up the values received at the time.
Hence, no memory is needed for storing values of other groups. This characteristic is realized by the
assumption on stochasticity of the distributed link matrices. In contrast, such memory is necessary
in algorithms using asynchronous iteration [6, 20, 36]. In fact, the most recently received values of
all groups having links to the group need to be stored. Hence, the memory size is determined by
the number of incoming links and may be large for popular groups. $\nabla$

The convergence rate of this scheme can be discussed from the viewpoint of its average dynamics.
As mentioned in Remark 6.3, the average state $\bar{\xi}(k)$ converges to the aggregated PageRank $\xi'$.
Because of the recursion (43), the asymptotic rate of convergence is exponential and is dominated
by the second largest eigenvalue $\lambda_2(\bar{\Gamma})$ of $\bar{\Gamma}$ in magnitude. By Lemma 6.2 (ii), this eigenvalue
can be bounded as

$$|\lambda_2(\bar{\Gamma})| \le \frac{\hat{m}}{m}|\lambda_2(\Gamma)| + 1 - \frac{\hat{m}}{m} \le 1 - \hat{m} = 1 - \frac{\alpha m}{1 - (1-\alpha)m}, \tag{44}$$

where the second inequality holds because $|\lambda_2(\Gamma)| \le 1 - m$; see [37]. Therefore, the convergence rate
depends on the base probability $\alpha$ in communication, i.e., more communication implies faster convergence.
On the other hand, it is interesting to observe that the bound (44) above is independent
of the choices of the individual update probabilities $\alpha_{i,\ell}$ as well as the number $r$ of groups.

$^5$ Practical issues related to implementation of the scheme are outside the scope of this paper. Clearly, for the
PageRank values reported by page owners to be trusted, some regulations must be enforced. Also, reliability of the
rankings can be affected by links purposefully added to increase PageRank of certain pages; some works have reported
methods to detect such web spamming (e.g., [3, 37]). Further discussions on this point are given in the footnote of
the Introduction.
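
The dependence of the bound on the base probability can be tabulated directly. In the sketch below the bound is taken to be $1 - \hat{m}$ with $\hat{m} = m\alpha/(1 - (1-\alpha)m)$, which is what Lemma 6.2 (ii) yields together with $|\lambda_2(\Gamma)| \le 1 - m$; the function name `rate_bound` and the value $m = 0.15$ are assumptions.

```python
def rate_bound(m, alpha):
    """Upper bound 1 - m_hat on the second largest eigenvalue magnitude of the average matrix."""
    m_hat = m * alpha / (1.0 - (1.0 - alpha) * m)
    return 1.0 - m_hat


m = 0.15   # assumed parameter value
for alpha in (0.1, 0.5, 1.0):
    print(alpha, round(rate_bound(m, alpha), 4))
```

For $\alpha = 1$ the bound reduces to $1 - m$, the rate of the deterministic recursion (36), and it approaches 1 as $\alpha$ decreases, which quantifies the tradeoff between communication load and convergence speed.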

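For the update probabilities $\alpha_{i,\ell}$ to satisfy the conditions (39), one possible choice is the following:

$$\alpha_{i,\ell} = \begin{cases} 1 - \alpha & \text{if } \ell = 0,\\[1mm] \alpha\,\dfrac{\sum_{j \in \mathcal{V}_{i,\ell}}\phi_{ji}}{\sum_{\ell'=1}^{g_i}\sum_{j \in \mathcal{V}_{i,\ell'}}\phi_{ji}} & \text{if } \ell = 1, \ldots, g_i, \end{cases} \quad \text{for } i \in \{1, \ldots, r\}. \tag{45}$$

In this case, the probability for group $i$ to transmit information to some neighbor is in total equal
to the base probability: $\sum_{\ell=1}^{g_i}\alpha_{i,\ell} = \alpha$. Further, the probability for communicating with the groups in
$\mathcal{V}_{i,\ell}$ is proportional to the weights $\sum_{j \in \mathcal{V}_{i,\ell}}\phi_{ji}$ of the corresponding entries in the $i$th column of the
link matrix $\Phi$. Hence, the frequency of communication among groups with more links is higher.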



Example 6.6 We study the distributed algorithm based on the aggregated web in Fig. 2 from
Section 4.3. Let the communication be such that each group communicates with the neighbors
separately and the update probabilities $\alpha_{i,\ell}$ are chosen as in (45) above. Thus, for example, group 1
has two neighbor sets given by $\mathcal{V}_{1,1} = \{2\}$ and $\mathcal{V}_{1,2} = \{3\}$. Their update probabilities will become
$\alpha_{1,1} = \alpha_{1,2} = \alpha/2$, resulting in the probability of no communication to be $\alpha_{1,0} = 1 - \alpha$; these
probabilities are indicated in Fig. 2. Group 2 also has two links, so let $\mathcal{V}_{2,1} = \{1\}$ and $\mathcal{V}_{2,2} = \{3\}$.
Similarly to the case above, let the update probabilities be $\alpha_{2,0} = 1 - \alpha$, $\alpha_{2,1} = \alpha/3$, and $\alpha_{2,2} = 2\alpha/3$.
Finally, the only link of group 3 is to group 2, which forms the group $\mathcal{V}_{3,1} = \{2\}$. We can take
$\alpha_{3,0} = 1 - \alpha$ and $\alpha_{3,1} = \alpha$.

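The distributed link matrices $\Phi_{q_1,q_2,q_3}$ in (40) can be expressed column by column as
$\Phi_{q_1,q_2,q_3} = \bigl[\tilde{\phi}_1^{(q_1)} \;\; \tilde{\phi}_2^{(q_2)} \;\; \tilde{\phi}_3^{(q_3)}\bigr]$, where

$$\tilde{\phi}_1^{(0)} = e_1, \quad \tilde{\phi}_1^{(1)} = [1/2 \;\; 1/2 \;\; 0]^T, \quad \tilde{\phi}_1^{(2)} = [1/2 \;\; 0 \;\; 1/2]^T,$$
$$\tilde{\phi}_2^{(0)} = e_2, \quad \tilde{\phi}_2^{(1)} = e_1, \quad \tilde{\phi}_2^{(2)} = e_3, \qquad \tilde{\phi}_3^{(0)} = e_3, \quad \tilde{\phi}_3^{(1)} = [0 \;\; 1/9 \;\; 8/9]^T.$$

It is straightforward to check that the properties shown in Proposition 6.1 hold: Each $\Phi_{q_1,q_2,q_3}$ is a
stochastic matrix and the average matrix $\bar{\Phi}$ satisfies the relation $\bar{\Phi} = \alpha\Phi + (1-\alpha)I$. $\nabla$

Figure 3: Connectivity matrix with a block diagonal structure

Figure 4: The original node parameters $\delta_i$ of the pages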
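
The two properties of Proposition 6.1 can be checked exhaustively for this example with exact rational arithmetic. The sketch below is an illustration under assumptions: it builds the columns of the distributed link matrices by scaling the transmitted entries of $\Phi$ by $\alpha/\alpha_{i,\ell}$ and leaving the remainder at the sender (which is what the proof of the proposition uses), it takes the update probabilities proportional to the link weights, and it uses the hypothetical names `column` and `probs`.

```python
from fractions import Fraction as F
from itertools import product

alpha = F(1, 2)                                    # base probability (any value in (0, 1])
Phi = [[F(1, 2), F(1, 3), F(0)],
       [F(1, 4), F(0),    F(1, 9)],
       [F(1, 4), F(2, 3), F(8, 9)]]                # A~_11 of the example in Section 4.3
neighbor_sets = [[[1], [2]], [[0], [2]], [[1]]]    # sets V_{i,l} with 0-based indices
r = 3

# Update probabilities: index 0 means "no transmission", l >= 1 a neighbor set.
probs = []
for i in range(r):
    w = [sum(Phi[j][i] for j in S) for S in neighbor_sets[i]]
    probs.append([1 - alpha] + [alpha * wl / sum(w) for wl in w])


def column(i, l):
    """Column i of the distributed link matrix when q_i = l."""
    col = [F(0)] * r
    if l == 0:
        col[i] = F(1)
        return col
    scale = alpha / probs[i][l]
    for j in neighbor_sets[i][l - 1]:
        col[j] = scale * Phi[j][i]
    col[i] = 1 - sum(col)                          # remainder stays at group i
    return col


average = [[F(0)] * r for _ in range(r)]
all_stochastic = True
for q in product(*(range(len(p)) for p in probs)):
    cols = [column(i, q[i]) for i in range(r)]
    weight = F(1)
    for i in range(r):
        weight *= probs[i][q[i]]                   # groups draw q_i independently
    for i in range(r):
        all_stochastic &= (sum(cols[i]) == 1 and min(cols[i]) >= 0)
        for p in range(r):
            average[p][i] += weight * cols[i][p]

expected = [[alpha * Phi[p][i] + (1 - alpha) * (p == i) for i in range(r)]
            for p in range(r)]
print(all_stochastic, average == expected)
```

Using `Fraction` avoids rounding, so the comparison of the average matrix with $\alpha\Phi + (1-\alpha)I$ is exact.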



7 Numerical example 

In this section, we provide a numerical example to illustrate the aggregation-based approach. 

We consider a web with 200 pages ($n = 200$) whose link structure was randomly generated with
some level of sparsity. The connectivity matrix representing the links is shown in Fig. 3, where the
mark $\times$ indicates the presence of a link from page $j$ to $i$. Here, we made 12 dense diagonal blocks,
each consisting of 5 to 30 pages. Further, there are two less dense blocks of pages 1 to 50
and pages 51 to 90. These blocks are initially considered as groups. The first page in each dense
block has links from all other pages in the same block, and in particular, pages 1 and 51 are given
more incoming and/or outgoing links; these pages will have high PageRank values as we will see.

Following the grouping procedure outlined in Remark 3.2, we first computed the node parameters
$\delta_i$ for each page $i$ with respect to the groups described above; these are plotted in Fig. 4. This
parameter can be large when the corresponding page has many outgoing links, as is the case for pages 1 and 51. It
can also be large when a page is in a small group, but has external links; this is the case with pages
173, 194, and 195. However, many pages have no external links, and hence the average value of $\delta_i$
is relatively small at 0.0260. We remark that this average is similar to the value reported in [10],
which is found from real web data when pages are grouped according to the hosts.

The grouping procedure is determined by the bound $\delta$ on node parameters. The relation
between $\delta$ and the number $r$ of groups is shown in Fig. 5. The line in the figure is not necessarily
nonincreasing and is in fact piecewise constant. On the other hand, Fig. 6 exhibits the relation
between $\delta$ and the error in the approximated PageRank measured by $\|x^* - x'\|_1$. These two figures
clearly show the tradeoff between accuracy in computation and the size of the problem: Smaller $\delta$
implies smaller error but larger number $r$ of groups.

Figure 6: Error $\|x^* - x'\|_1$ in PageRank versus $\delta$

To see the errors in the PageRank of the individual pages, we plotted in Fig. [7] the true values 
x* and the approximated values x\ for i G V when 5 = 0.2 is used by x and Oi respectively. This 
computation was done with 71 groups (r = 71). We observe that the error is small in general. 

Finally, we applied the distributed randomized algorithm (37) and (38) for computing the 
aggregated PageRank $\xi'$ and then the entire PageRank vector $x$. Here, for the gossip communication 
protocol, the grouping $\mathcal{V}_{i,\ell}$, $i = 1, \ldots, r$, $\ell = 0, 1, \ldots, g_i$, of the neighbors is based on the 
original groups in the block structure of the connectivity matrices. Furthermore, we set the update 
probabilities $\alpha_{i,\ell}$ using the formula in (45) with the base probability $\alpha = 0.5$. In Fig. 8, sample 
paths of the time average $\psi_i(k)$ for groups 50 to 59 are shown in solid lines and the true values 
$\xi_i'$ of the aggregated PageRank in dashed lines. It is clear that the time average converges to the 
true value. 
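
A time average of this kind can be maintained recursively, without storing the whole sample path. The following is a minimal Python sketch, assuming the time average is the plain arithmetic mean of the states up to time $k$; the function name `running_time_average` and the toy samples are hypothetical and do not reproduce the update scheme (37) and (38).

```python
import numpy as np


def running_time_average(samples):
    """Yield y(k) = (1 / (k + 1)) * sum_{l=0}^{k} x(l) via a recursive update."""
    y = None
    for k, x in enumerate(samples):
        x = np.asarray(x, dtype=float)
        # y(k) = y(k-1) + (x(k) - y(k-1)) / (k + 1), with y(0) = x(0)
        y = x.copy() if y is None else y + (x - y) / (k + 1)
        yield y.copy()


# Toy sample path that alternates between two states.
samples = [[1, 0], [0, 1], [1, 0], [0, 1]]
averages = list(running_time_average(samples))
```

Although the toy states keep oscillating, their time average settles between them, which is the kind of smoothing visible in the solid lines of the plot.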

We would like to emphasize that the convergence performance of this reduced-order scheme is 
similar to the case without aggregation. To make a fair comparison, we computed the state $x(k)$ 
obtained from the time average $\psi_i(k)$ and then its overall error from the true PageRank $x^*$. We 
plotted in Fig. 9 the 1-norm of $x(k) - x^*$ by the solid line. Then, the same error was computed from 
the time average of the full-order scheme, which can be obtained by setting the node parameter 
very small as $\delta = 0.01$; the result is shown by the dashed line in the same plot. The convergence 
rates as well as the achieved error levels at the final time are comparable for the two cases. The 
reduced-order case is clearly advantageous since it requires fewer operations, as we have discussed 
in Remark 4.4. More concretely, in the example considered here, the numbers of nonzero entries 
for the link matrices $A \in \mathbb{R}^{200 \times 200}$ and $\widetilde{A}_{11} \in \mathbb{R}^{71 \times 71}$ are, respectively, $f_0(A) = 2623$ and 
$f_0(\widetilde{A}_{11}) = 780$. In view of Table 1, the difference in computation costs is evident. 
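
The two quantities used in this comparison, the number of nonzero entries of a link matrix and the 1-norm error of a PageRank estimate, are simple to evaluate numerically. The Python sketch below is only an illustration: the 3-by-3 matrix, the vectors, and the helper names `count_nonzero_entries` and `l1_error` are hypothetical, and only the counts 2623 and 780 are taken from the example above.

```python
import numpy as np


def count_nonzero_entries(matrix):
    """Number of nonzero entries of a (dense) link matrix."""
    return int(np.count_nonzero(matrix))


def l1_error(x_approx, x_true):
    """1-norm of the difference between an estimate and the true vector."""
    diff = np.asarray(x_approx, dtype=float) - np.asarray(x_true, dtype=float)
    return float(np.sum(np.abs(diff)))


# Hypothetical column-stochastic link matrix of a 3-page web.
A = np.array([[0.0, 0.5, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 0.5, 0.0]])
nnz = count_nonzero_entries(A)

# Hypothetical estimate and true value, both summing to one.
err = l1_error([0.3, 0.4, 0.3], [0.25, 0.5, 0.25])

# Ratio of the nonzero counts reported in the text (reduced order / full order).
ratio = 780 / 2623
```

With the reported counts, the reduced-order matrix has roughly 30% as many nonzero entries as the full-order one.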
8 Conclusion 

In this paper, we have developed a distributed randomized algorithm for obtaining the PageRank 
values which performs well with reduced computation and communication loads. The approach is 
based on a novel aggregation technique of the web. First, we have proposed a simple procedure for 
grouping pages under the criterion of node parameters. Then, the notion of aggregated PageRank 
has been introduced, based on which approximations of the true values for individual pages can 
be computed. Error bounds on the approximation level have been derived. Moreover, we have 
developed the distributed randomized algorithm of lower order for the computation of aggregated 
PageRank. The advantages of the approach in terms of computation cost as well as convergence 
properties have been discussed in detail and also demonstrated by a numerical example. 

In the future, we will further study aggregation-based methods for PageRank to improve the 
convergence rate of the update scheme and to examine the effects of incorporating multi-level 
groupings. More generally, for large-scale systems, the concept of aggregation is of great importance 
and may provide useful tools for other interesting problems such as smart grids for power distribution. 

Acknowledgement: The authors would like to thank Athanasios C. Antoulas, Fabrizio Dabbene, 
Soura Dasgupta, Shinji Hara, and Jun-ichi Imura for helpful discussions. 